Pyserini: Prebuilt Indexes

Pyserini provides a number of pre-built Lucene indexes. To list what's available in code:

from pyserini.search.lucene import LuceneSearcher
LuceneSearcher.list_prebuilt_indexes()

from pyserini.index.lucene import IndexReader
IndexReader.list_prebuilt_indexes()

It's easy to initialize a searcher from a pre-built index:

searcher = LuceneSearcher.from_prebuilt_index('robust04')
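Once a searcher is initialized, a typical next step is to run a query and read off the document ids and scores. The following is a minimal sketch, not part of Pyserini: `top_hits` is a hypothetical helper name, and it only assumes that the searcher exposes `search(query, k=...)` and that each hit has `docid` and `score` attributes, as `LuceneSearcher` hits do.

```python
from typing import List, Tuple


def top_hits(searcher, query: str, k: int = 10) -> List[Tuple[str, float]]:
    """Return (docid, score) pairs for the top-k hits of a query.

    `searcher` is any object with a `search(query, k=...)` method whose
    results expose `docid` and `score`, e.g. a LuceneSearcher.
    """
    hits = searcher.search(query, k=k)
    return [(hit.docid, hit.score) for hit in hits]


# Example usage with a real pre-built index (downloads the index on first use):
#
#   from pyserini.search.lucene import LuceneSearcher
#   searcher = LuceneSearcher.from_prebuilt_index('robust04')
#   for docid, score in top_hits(searcher, 'hubble space telescope', k=5):
#       print(f'{docid} {score:.4f}')
```

Keeping the helper independent of the concrete searcher class makes it easy to try the same code against different pre-built indexes.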

You can use this simple Python one-liner to download the pre-built index:

python -c "from pyserini.search.lucene import LuceneSearcher; LuceneSearcher.from_prebuilt_index('robust04')"

The downloaded index will be in ~/.cache/pyserini/indexes/.
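To see which indexes have already been downloaded, you can inspect that directory. The sketch below uses only the standard library; `list_cached_indexes` is a hypothetical helper, not a Pyserini function, and it assumes each downloaded index is unpacked into its own subdirectory (directory names may carry version or checksum suffixes).

```python
from pathlib import Path
from typing import List, Optional


def list_cached_indexes(cache_dir: Optional[Path] = None) -> List[str]:
    """Return the sorted names of subdirectories in the index cache."""
    if cache_dir is None:
        # Default location stated above; adjust if your cache lives elsewhere.
        cache_dir = Path.home() / '.cache' / 'pyserini' / 'indexes'
    if not cache_dir.is_dir():
        # Nothing has been downloaded yet (or the cache is somewhere else).
        return []
    return sorted(p.name for p in cache_dir.iterdir() if p.is_dir())
```

Calling `list_cached_indexes()` with no arguments looks in `~/.cache/pyserini/indexes/` and returns an empty list if the directory does not exist.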

It's similarly easy to initialize an index reader from a pre-built index:

index_reader = IndexReader.from_prebuilt_index('robust04')
index_reader.stats()

The output will be:

{'total_terms': 174540872, 'documents': 528030, 'non_empty_documents': 528030, 'unique_terms': 923436}

Note that unless the underlying index was built with the -optimize option (i.e., merging all index segments into a single segment), unique_terms will show -1. Nope, that's not a bug.
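When consuming these statistics in code, it can help to treat the `-1` sentinel explicitly rather than as a real count. The following is a small sketch operating on the dictionary shown above; `summarize_stats` is a hypothetical helper name, and the average document length is simply `total_terms / documents`.

```python
def summarize_stats(stats: dict) -> dict:
    """Derive a few convenience values from an index stats dictionary."""
    documents = stats['documents']
    # Average number of terms per document; guard against an empty index.
    avg_len = stats['total_terms'] / documents if documents else 0.0
    # -1 means the count is unavailable (index not merged into one segment).
    unique = stats.get('unique_terms', -1)
    return {
        'avg_doc_length': avg_len,
        'unique_terms': unique if unique >= 0 else None,
    }


# Stats for robust04, copied from the output above.
robust04_stats = {
    'total_terms': 174540872,
    'documents': 528030,
    'non_empty_documents': 528030,
    'unique_terms': 923436,
}
summary = summarize_stats(robust04_stats)
print(f"average document length: {summary['avg_doc_length']:.1f} terms")
print(f"unique terms: {summary['unique_terms']}")
```

Mapping `-1` to `None` keeps downstream code from accidentally using the sentinel in arithmetic.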

Below is a summary of the pre-built indexes that are currently available. Detailed configuration information for the pre-built indexes is stored in pyserini/prebuilt_index_info.py.

Standard Lucene Indexes

msmarco-v1-doc [readme]
Lucene index of the MS MARCO V1 document corpus. (Lucene 9)
msmarco-v1-doc-slim [readme]
Lucene index of the MS MARCO V1 document corpus ('slim' version). (Lucene 9)
msmarco-v1-doc-full [readme]
Lucene index of the MS MARCO V1 document corpus ('full' version). (Lucene 9)
msmarco-v1-doc-d2q-t5 [readme]
Lucene index of the MS MARCO V1 document corpus with doc2query-T5 expansions. (Lucene 9)
msmarco-v1-doc-d2q-t5-docvectors [readme]
Lucene index (+docvectors) of the MS MARCO V1 document corpus with doc2query-T5 expansions. (Lucene 9)
msmarco-v1-doc-segmented [readme]
Lucene index of the MS MARCO V1 segmented document corpus. (Lucene 9)
msmarco-v1-doc-segmented-slim [readme]
Lucene index of the MS MARCO V1 segmented document corpus ('slim' version). (Lucene 9)
msmarco-v1-doc-segmented-full [readme]
Lucene index of the MS MARCO V1 segmented document corpus ('full' version). (Lucene 9)
msmarco-v1-doc-segmented-d2q-t5 [readme]
Lucene index of the MS MARCO V1 segmented document corpus with doc2query-T5 expansions. (Lucene 9)
msmarco-v1-doc-segmented-d2q-t5-docvectors [readme]
Lucene index (+docvectors) of the MS MARCO V1 segmented document corpus with doc2query-T5 expansions. (Lucene 9)
msmarco-v1-passage [readme]
Lucene index of the MS MARCO V1 passage corpus. (Lucene 9)
msmarco-v1-passage-slim [readme]
Lucene index of the MS MARCO V1 passage corpus ('slim' version). (Lucene 9)
msmarco-v1-passage-full [readme]
Lucene index of the MS MARCO V1 passage corpus ('full' version). (Lucene 9)
msmarco-v1-passage-d2q-t5 [readme]
Lucene index of the MS MARCO V1 passage corpus with doc2query-T5 expansions. (Lucene 9)
msmarco-v1-passage-d2q-t5-docvectors [readme]
Lucene index (+docvectors) of the MS MARCO V1 passage corpus with doc2query-T5 expansions. (Lucene 9)
msmarco-passage-ltr [readme]
Lucene index of the MS MARCO passage corpus with four extra preprocessed fields for LTR. (Lucene 8)
msmarco-doc-per-passage-ltr
Lucene index of the MS MARCO document per-passage corpus with four extra preprocessed fields for LTR. (Lucene 8)
msmarco-document-segment-ltr
Lucene index of the MS MARCO document segmented corpus with four extra preprocessed fields for LTR. (Lucene 8)
msmarco-v2-doc [readme]
Lucene index of the MS MARCO V2 document corpus. (Lucene 9)
msmarco-v2-doc-slim [readme]
Lucene index of the MS MARCO V2 document corpus ('slim' version). (Lucene 9)
msmarco-v2-doc-full [readme]
Lucene index of the MS MARCO V2 document corpus ('full' version). (Lucene 9)
msmarco-v2-doc-d2q-t5 [readme]
Lucene index of the MS MARCO V2 document corpus with doc2query-T5 expansions. (Lucene 9)
msmarco-v2-doc-d2q-t5-docvectors [readme]
Lucene index (+docvectors) of the MS MARCO V2 document corpus with doc2query-T5 expansions. (Lucene 9)
msmarco-v2-doc-segmented [readme]
Lucene index of the MS MARCO V2 segmented document corpus. (Lucene 9)
msmarco-v2-doc-segmented-slim [readme]
Lucene index of the MS MARCO V2 segmented document corpus ('slim' version). (Lucene 9)
msmarco-v2-doc-segmented-full [readme]
Lucene index of the MS MARCO V2 segmented document corpus ('full' version). (Lucene 9)
msmarco-v2-doc-segmented-d2q-t5 [readme]
Lucene index of the MS MARCO V2 segmented document corpus with doc2query-T5 expansions. (Lucene 9)
msmarco-v2-doc-segmented-d2q-t5-docvectors [readme]
Lucene index (+docvectors) of the MS MARCO V2 segmented document corpus with doc2query-T5 expansions. (Lucene 9)
msmarco-v2-passage [readme]
Lucene index of the MS MARCO V2 passage corpus. (Lucene 9)
msmarco-v2-passage-slim [readme]
Lucene index of the MS MARCO V2 passage corpus ('slim' version). (Lucene 9)
msmarco-v2-passage-full [readme]
Lucene index of the MS MARCO V2 passage corpus ('full' version). (Lucene 9)
msmarco-v2-passage-d2q-t5 [readme]
Lucene index of the MS MARCO V2 passage corpus with doc2query-T5 expansions. (Lucene 9)
msmarco-v2-passage-d2q-t5-docvectors [readme]
Lucene index (+docvectors) of the MS MARCO V2 passage corpus with doc2query-T5 expansions. (Lucene 9)
msmarco-v2-passage-augmented [readme]
Lucene index of the MS MARCO V2 augmented passage corpus. (Lucene 9)
msmarco-v2-passage-augmented-slim [readme]
Lucene index of the MS MARCO V2 augmented passage corpus ('slim' version). (Lucene 9)
msmarco-v2-passage-augmented-full [readme]
Lucene index of the MS MARCO V2 augmented passage corpus ('full' version). (Lucene 9)
msmarco-v2-passage-augmented-d2q-t5 [readme]
Lucene index of the MS MARCO V2 augmented passage corpus with doc2query-T5 expansions. (Lucene 9)
msmarco-v2-passage-augmented-d2q-t5-docvectors [readme]
Lucene index (+docvectors) of the MS MARCO V2 augmented passage corpus with doc2query-T5 expansions. (Lucene 9)
beir-v1.0.0-trec-covid.flat [readme]
Lucene flat index of BEIR (v1.0.0): TREC-COVID
beir-v1.0.0-bioasq.flat [readme]
Lucene flat index of BEIR (v1.0.0): BioASQ
beir-v1.0.0-nfcorpus.flat [readme]
Lucene flat index of BEIR (v1.0.0): NFCorpus
beir-v1.0.0-nq.flat [readme]
Lucene flat index of BEIR (v1.0.0): NQ
beir-v1.0.0-hotpotqa.flat [readme]
Lucene flat index of BEIR (v1.0.0): HotpotQA
beir-v1.0.0-fiqa.flat [readme]
Lucene flat index of BEIR (v1.0.0): FiQA-2018
beir-v1.0.0-signal1m.flat [readme]
Lucene flat index of BEIR (v1.0.0): Signal-1M
beir-v1.0.0-trec-news.flat [readme]
Lucene flat index of BEIR (v1.0.0): TREC-NEWS
beir-v1.0.0-robust04.flat [readme]
Lucene flat index of BEIR (v1.0.0): Robust04
beir-v1.0.0-arguana.flat [readme]
Lucene flat index of BEIR (v1.0.0): ArguAna
beir-v1.0.0-webis-touche2020.flat [readme]
Lucene flat index of BEIR (v1.0.0): Webis-Touche2020
beir-v1.0.0-cqadupstack-android.flat [readme]
Lucene flat index of BEIR (v1.0.0): CQADupStack-android
beir-v1.0.0-cqadupstack-english.flat [readme]
Lucene flat index of BEIR (v1.0.0): CQADupStack-english
beir-v1.0.0-cqadupstack-gaming.flat [readme]
Lucene flat index of BEIR (v1.0.0): CQADupStack-gaming
beir-v1.0.0-cqadupstack-gis.flat [readme]
Lucene flat index of BEIR (v1.0.0): CQADupStack-gis
beir-v1.0.0-cqadupstack-mathematica.flat [readme]
Lucene flat index of BEIR (v1.0.0): CQADupStack-mathematica
beir-v1.0.0-cqadupstack-physics.flat [readme]
Lucene flat index of BEIR (v1.0.0): CQADupStack-physics
beir-v1.0.0-cqadupstack-programmers.flat [readme]
Lucene flat index of BEIR (v1.0.0): CQADupStack-programmers
beir-v1.0.0-cqadupstack-stats.flat [readme]
Lucene flat index of BEIR (v1.0.0): CQADupStack-stats
beir-v1.0.0-cqadupstack-tex.flat [readme]
Lucene flat index of BEIR (v1.0.0): CQADupStack-tex
beir-v1.0.0-cqadupstack-unix.flat [readme]
Lucene flat index of BEIR (v1.0.0): CQADupStack-unix
beir-v1.0.0-cqadupstack-webmasters.flat [readme]
Lucene flat index of BEIR (v1.0.0): CQADupStack-webmasters
beir-v1.0.0-cqadupstack-wordpress.flat [readme]
Lucene flat index of BEIR (v1.0.0): CQADupStack-wordpress
beir-v1.0.0-quora.flat [readme]
Lucene flat index of BEIR (v1.0.0): Quora
beir-v1.0.0-dbpedia-entity.flat [readme]
Lucene flat index of BEIR (v1.0.0): DBPedia
beir-v1.0.0-scidocs.flat [readme]
Lucene flat index of BEIR (v1.0.0): SCIDOCS
beir-v1.0.0-fever.flat [readme]
Lucene flat index of BEIR (v1.0.0): FEVER
beir-v1.0.0-climate-fever.flat [readme]
Lucene flat index of BEIR (v1.0.0): Climate-FEVER
beir-v1.0.0-scifact.flat [readme]
Lucene flat index of BEIR (v1.0.0): SciFact
beir-v1.0.0-trec-covid.multifield [readme]
Lucene multifield index of BEIR (v1.0.0): TREC-COVID
beir-v1.0.0-bioasq.multifield [readme]
Lucene multifield index of BEIR (v1.0.0): BioASQ
beir-v1.0.0-nfcorpus.multifield [readme]
Lucene multifield index of BEIR (v1.0.0): NFCorpus
beir-v1.0.0-nq.multifield [readme]
Lucene multifield index of BEIR (v1.0.0): NQ
beir-v1.0.0-hotpotqa.multifield [readme]
Lucene multifield index of BEIR (v1.0.0): HotpotQA
beir-v1.0.0-fiqa.multifield [readme]
Lucene multifield index of BEIR (v1.0.0): FiQA-2018
beir-v1.0.0-signal1m.multifield [readme]
Lucene multifield index of BEIR (v1.0.0): Signal-1M
beir-v1.0.0-trec-news.multifield [readme]
Lucene multifield index of BEIR (v1.0.0): TREC-NEWS
beir-v1.0.0-robust04.multifield [readme]
Lucene multifield index of BEIR (v1.0.0): Robust04
beir-v1.0.0-arguana.multifield [readme]
Lucene multifield index of BEIR (v1.0.0): ArguAna
beir-v1.0.0-webis-touche2020.multifield [readme]
Lucene multifield index of BEIR (v1.0.0): Webis-Touche2020
beir-v1.0.0-cqadupstack-android.multifield [readme]
Lucene multifield index of BEIR (v1.0.0): CQADupStack-android
beir-v1.0.0-cqadupstack-english.multifield [readme]
Lucene multifield index of BEIR (v1.0.0): CQADupStack-english
beir-v1.0.0-cqadupstack-gaming.multifield [readme]
Lucene multifield index of BEIR (v1.0.0): CQADupStack-gaming
beir-v1.0.0-cqadupstack-gis.multifield [readme]
Lucene multifield index of BEIR (v1.0.0): CQADupStack-gis
beir-v1.0.0-cqadupstack-mathematica.multifield [readme]
Lucene multifield index of BEIR (v1.0.0): CQADupStack-mathematica
beir-v1.0.0-cqadupstack-physics.multifield [readme]
Lucene multifield index of BEIR (v1.0.0): CQADupStack-physics
beir-v1.0.0-cqadupstack-programmers.multifield [readme]
Lucene multifield index of BEIR (v1.0.0): CQADupStack-programmers
beir-v1.0.0-cqadupstack-stats.multifield [readme]
Lucene multifield index of BEIR (v1.0.0): CQADupStack-stats
beir-v1.0.0-cqadupstack-tex.multifield [readme]
Lucene multifield index of BEIR (v1.0.0): CQADupStack-tex
beir-v1.0.0-cqadupstack-unix.multifield [readme]
Lucene multifield index of BEIR (v1.0.0): CQADupStack-unix
beir-v1.0.0-cqadupstack-webmasters.multifield [readme]
Lucene multifield index of BEIR (v1.0.0): CQADupStack-webmasters
beir-v1.0.0-cqadupstack-wordpress.multifield [readme]
Lucene multifield index of BEIR (v1.0.0): CQADupStack-wordpress
beir-v1.0.0-quora.multifield [readme]
Lucene multifield index of BEIR (v1.0.0): Quora
beir-v1.0.0-dbpedia-entity.multifield [readme]
Lucene multifield index of BEIR (v1.0.0): DBPedia
beir-v1.0.0-scidocs.multifield [readme]
Lucene multifield index of BEIR (v1.0.0): SCIDOCS
beir-v1.0.0-fever.multifield [readme]
Lucene multifield index of BEIR (v1.0.0): FEVER
beir-v1.0.0-climate-fever.multifield [readme]
Lucene multifield index of BEIR (v1.0.0): Climate-FEVER
beir-v1.0.0-scifact.multifield [readme]
Lucene multifield index of BEIR (v1.0.0): SciFact
mrtydi-v1.1-arabic [readme]
Lucene index for Mr.TyDi v1.1 (Arabic).
mrtydi-v1.1-bengali [readme]
Lucene index for Mr.TyDi v1.1 (Bengali).
mrtydi-v1.1-english [readme]
Lucene index for Mr.TyDi v1.1 (English).
mrtydi-v1.1-finnish [readme]
Lucene index for Mr.TyDi v1.1 (Finnish).
mrtydi-v1.1-indonesian [readme]
Lucene index for Mr.TyDi v1.1 (Indonesian).
mrtydi-v1.1-japanese [readme]
Lucene index for Mr.TyDi v1.1 (Japanese).
mrtydi-v1.1-korean [readme]
Lucene index for Mr.TyDi v1.1 (Korean).
mrtydi-v1.1-russian [readme]
Lucene index for Mr.TyDi v1.1 (Russian).
mrtydi-v1.1-swahili [readme]
Lucene index for Mr.TyDi v1.1 (Swahili).
mrtydi-v1.1-telugu [readme]
Lucene index for Mr.TyDi v1.1 (Telugu).
mrtydi-v1.1-thai [readme]
Lucene index for Mr.TyDi v1.1 (Thai).
miracl-v1.0-ar [readme]
Lucene index for MIRACL v1.0 (Arabic).
miracl-v1.0-bn [readme]
Lucene index for MIRACL v1.0 (Bengali).
miracl-v1.0-en [readme]
Lucene index for MIRACL v1.0 (English).
miracl-v1.0-es [readme]
Lucene index for MIRACL v1.0 (Spanish).
miracl-v1.0-fa [readme]
Lucene index for MIRACL v1.0 (Persian).
miracl-v1.0-fi [readme]
Lucene index for MIRACL v1.0 (Finnish).
miracl-v1.0-fr [readme]
Lucene index for MIRACL v1.0 (French).
miracl-v1.0-hi [readme]
Lucene index for MIRACL v1.0 (Hindi).
miracl-v1.0-id [readme]
Lucene index for MIRACL v1.0 (Indonesian).
miracl-v1.0-ja [readme]
Lucene index for MIRACL v1.0 (Japanese).
miracl-v1.0-ko [readme]
Lucene index for MIRACL v1.0 (Korean).
miracl-v1.0-ru [readme]
Lucene index for MIRACL v1.0 (Russian).
miracl-v1.0-sw [readme]
Lucene index for MIRACL v1.0 (Swahili).
miracl-v1.0-te [readme]
Lucene index for MIRACL v1.0 (Telugu).
miracl-v1.0-th [readme]
Lucene index for MIRACL v1.0 (Thai).
miracl-v1.0-zh [readme]
Lucene index for MIRACL v1.0 (Chinese).
miracl-v1.0-de [readme]
Lucene index for MIRACL v1.0 (German).
miracl-v1.0-yo [readme]
Lucene index for MIRACL v1.0 (Yoruba).
ciral-v1.0-ha [readme]
Lucene index for CIRAL v1.0 (Hausa).
ciral-v1.0-so [readme]
Lucene index for CIRAL v1.0 (Somali).
ciral-v1.0-sw [readme]
Lucene index for CIRAL v1.0 (Swahili).
ciral-v1.0-yo [readme]
Lucene index for CIRAL v1.0 (Yoruba).
cacm
Lucene index of the CACM corpus. (Lucene 9)
robust04 [readme]
Lucene index of TREC Disks 4 & 5 (minus Congressional Records), used in the TREC 2004 Robust Track. (Lucene 9)
enwiki-paragraphs
Lucene index of English Wikipedia for BERTserini
zhwiki-paragraphs
Lucene index of Chinese Wikipedia for BERTserini
trec-covid-r5-abstract
Lucene index for TREC-COVID Round 5: abstract index
trec-covid-r5-full-text
Lucene index for TREC-COVID Round 5: full-text index
trec-covid-r5-paragraph
Lucene index for TREC-COVID Round 5: paragraph index
trec-covid-r4-abstract
Lucene index for TREC-COVID Round 4: abstract index
trec-covid-r4-full-text
Lucene index for TREC-COVID Round 4: full-text index
trec-covid-r4-paragraph
Lucene index for TREC-COVID Round 4: paragraph index
trec-covid-r3-abstract
Lucene index for TREC-COVID Round 3: abstract index
trec-covid-r3-full-text
Lucene index for TREC-COVID Round 3: full-text index
trec-covid-r3-paragraph
Lucene index for TREC-COVID Round 3: paragraph index
trec-covid-r2-abstract
Lucene index for TREC-COVID Round 2: abstract index
trec-covid-r2-full-text
Lucene index for TREC-COVID Round 2: full-text index
trec-covid-r2-paragraph
Lucene index for TREC-COVID Round 2: paragraph index
trec-covid-r1-abstract
Lucene index for TREC-COVID Round 1: abstract index
trec-covid-r1-full-text
Lucene index for TREC-COVID Round 1: full-text index
trec-covid-r1-paragraph
Lucene index for TREC-COVID Round 1: paragraph index
cast2019
Lucene index for TREC 2019 CaST
wikipedia-dpr-100w [readme]
Lucene index of Wikipedia with DPR 100-word splits
wikipedia-dpr-100w-slim [readme]
Lucene index of Wikipedia with DPR 100-word splits (slim version, document text not stored)
wikipedia-kilt-doc [readme]
Lucene index of Wikipedia snapshot used as KILT's knowledge source.
wiki-all-6-3-tamber [readme]
Lucene index of wiki-all-6-3-tamber from castorini/odqa-wiki-corpora
hc4-v1.0-fa [readme]
Lucene index for HC4 v1.0 (Persian). (Lucene 9)
hc4-v1.0-ru [readme]
Lucene index for HC4 v1.0 (Russian). (Lucene 9)
hc4-v1.0-zh [readme]
Lucene index for HC4 v1.0 (Chinese). (Lucene 9)
neuclir22-fa [readme]
Lucene index for NeuCLIR 2022 corpus (Persian). (Lucene 9)
neuclir22-ru [readme]
Lucene index for NeuCLIR 2022 corpus (Russian). (Lucene 9)
neuclir22-zh [readme]
Lucene index for NeuCLIR 2022 corpus (Chinese). (Lucene 9)
neuclir22-fa-en [readme]
Lucene index for NeuCLIR 2022 corpus (official English translation from Persian). (Lucene 9)
neuclir22-ru-en [readme]
Lucene index for NeuCLIR 2022 corpus (official English translation from Russian). (Lucene 9)
neuclir22-zh-en [readme]
Lucene index for NeuCLIR 2022 corpus (official English translation from Chinese). (Lucene 9)
atomic_text_v0.2.1_small_validation [readme]
Lucene index for AToMiC Text v0.2.1 small setting on validation set (Lucene 9)
atomic_text_v0.2.1_base [readme]
Lucene index for AToMiC Text v0.2.1 base setting on validation set (Lucene 9)
atomic_text_v0.2.1_large [readme]
Lucene index for AToMiC Text v0.2.1 large setting on validation set (Lucene 9)
atomic_image_v0.2_small_validation [readme]
Lucene index for AToMiC Images v0.2 small setting on validation set (Lucene 9)
atomic_image_v0.2_base [readme]
Lucene index for AToMiC Images v0.2 base setting on validation set (Lucene 9)
atomic_image_v0.2_large [readme]
Lucene index for AToMiC Images v0.2 large setting on validation set (Lucene 9)
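
The BEIR entries in the list above follow a regular naming pattern, `beir-v1.0.0-<dataset>.<variant>`, where the variant is `flat` or `multifield`. As a minimal sketch, the hypothetical helper below (not part of Pyserini) builds such a name from a dataset key as it appears in the list, for example `scifact` or `dbpedia-entity`.

```python
BEIR_VARIANTS = ('flat', 'multifield')


def beir_index_name(dataset: str, variant: str = 'flat') -> str:
    """Build a BEIR pre-built index name following the pattern listed above."""
    if variant not in BEIR_VARIANTS:
        raise ValueError(f'variant must be one of {BEIR_VARIANTS}, got {variant!r}')
    return f'beir-v1.0.0-{dataset}.{variant}'


# Names for a few datasets, in both variants.
for dataset in ('scifact', 'nfcorpus', 'dbpedia-entity'):
    for variant in BEIR_VARIANTS:
        print(beir_index_name(dataset, variant))
```

The resulting string can be passed to `LuceneSearcher.from_prebuilt_index`, provided the name matches an entry in the list.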

Lucene Impact Indexes

msmarco-v1-passage-slimr [readme]
Lucene impact index of the MS MARCO V1 passage corpus encoded by SLIM trained with BM25 negatives. (Lucene 9)
msmarco-v1-passage-slimr-pp [readme]
Lucene impact index of the MS MARCO V1 passage corpus encoded by SLIM trained with cross-encoder distillation and hard-negative mining. (Lucene 9)
msmarco-v1-passage-unicoil [readme]
Lucene impact index of the MS MARCO V1 passage corpus for uniCOIL. (Lucene 9)
msmarco-v1-passage-unicoil-noexp [readme]
Lucene impact index of the MS MARCO V1 passage corpus for uniCOIL (noexp). (Lucene 9)
msmarco-v1-passage-deepimpact [readme]
Lucene impact index of the MS MARCO passage corpus encoded by DeepImpact. (Lucene 9)
msmarco-v1-passage-unicoil-tilde [readme]
Lucene impact index of the MS MARCO passage corpus encoded by uniCOIL-TILDE. (Lucene 9)
msmarco-v1-passage-distill-splade-max [readme]
Lucene impact index of the MS MARCO passage corpus encoded by distill-splade-max. (Lucene 9)
msmarco-v1-passage-splade-pp-ed [readme]
Lucene impact index of the MS MARCO passage corpus encoded by SPLADE++ CoCondenser-EnsembleDistil. (Lucene 9)
msmarco-v1-passage-splade-pp-ed-docvectors [readme]
Lucene impact index (with docvectors) of the MS MARCO passage corpus encoded by SPLADE++ CoCondenser-EnsembleDistil. (Lucene 9)
msmarco-v1-passage-splade-pp-ed-text [readme]
Lucene impact index (with text) of the MS MARCO passage corpus encoded by SPLADE++ CoCondenser-EnsembleDistil. (Lucene 9)
msmarco-v1-passage-splade-pp-sd [readme]
Lucene impact index of the MS MARCO passage corpus encoded by SPLADE++ CoCondenser-SelfDistil. (Lucene 9)
msmarco-v1-passage-splade-pp-sd-docvectors [readme]
Lucene impact index (with docvectors) of the MS MARCO passage corpus encoded by SPLADE++ CoCondenser-SelfDistil. (Lucene 9)
msmarco-v1-passage-splade-pp-sd-text [readme]
Lucene impact index (with text) of the MS MARCO passage corpus encoded by SPLADE++ CoCondenser-SelfDistil. (Lucene 9)
msmarco-v1-doc-segmented-unicoil [readme]
Lucene impact index of the MS MARCO V1 segmented document corpus for uniCOIL, with title/segment encoding. (Lucene 9)
msmarco-v1-doc-segmented-unicoil-noexp [readme]
Lucene impact index of the MS MARCO V1 segmented document corpus for uniCOIL (noexp), with title/segment encoding. (Lucene 9)
msmarco-v2-passage-unicoil-0shot [readme]
Lucene impact index of the MS MARCO V2 passage corpus for uniCOIL. (Lucene 9)
msmarco-v2-passage-unicoil-noexp-0shot [readme]
Lucene impact index of the MS MARCO V2 passage corpus for uniCOIL (noexp). (Lucene 9)
msmarco-v2-doc-segmented-unicoil-0shot [readme]
Lucene impact index of the MS MARCO V2 segmented document corpus for uniCOIL, with title prepended. (Lucene 9)
msmarco-v2-doc-segmented-unicoil-noexp-0shot [readme]
Lucene impact index of the MS MARCO V2 segmented document corpus for uniCOIL (noexp) with title prepended. (Lucene 9)
beir-v1.0.0-trec-covid-splade_distil_cocodenser_medium [readme]
Lucene impact index of BEIR (v1.0.0): TREC-COVID encoded by SPLADE-distill CoCodenser-medium
beir-v1.0.0-bioasq-splade_distil_cocodenser_medium [readme]
Lucene impact index of BEIR (v1.0.0): BioASQ encoded by SPLADE-distill CoCodenser-medium
beir-v1.0.0-nfcorpus-splade_distil_cocodenser_medium [readme]
Lucene impact index of BEIR (v1.0.0): NFCorpus encoded by SPLADE-distill CoCodenser-medium
beir-v1.0.0-nq-splade_distil_cocodenser_medium [readme]
Lucene impact index of BEIR (v1.0.0): NQ encoded by SPLADE-distill CoCodenser-medium
beir-v1.0.0-hotpotqa-splade_distil_cocodenser_medium [readme]
Lucene impact index of BEIR (v1.0.0): HotpotQA encoded by SPLADE-distill CoCodenser-medium
beir-v1.0.0-fiqa-splade_distil_cocodenser_medium [readme]
Lucene impact index of BEIR (v1.0.0): FiQA-2018 encoded by SPLADE-distill CoCodenser-medium
beir-v1.0.0-signal1m-splade_distil_cocodenser_medium [readme]
Lucene impact index of BEIR (v1.0.0): Signal-1M encoded by SPLADE-distill CoCodenser-medium
beir-v1.0.0-trec-news-splade_distil_cocodenser_medium [readme]
Lucene impact index of BEIR (v1.0.0): TREC-NEWS encoded by SPLADE-distill CoCodenser-medium
beir-v1.0.0-robust04-splade_distil_cocodenser_medium [readme]
Lucene impact index of BEIR (v1.0.0): Robust04 encoded by SPLADE-distill CoCodenser-medium
beir-v1.0.0-arguana-splade_distil_cocodenser_medium [readme]
Lucene impact index of BEIR (v1.0.0): ArguAna encoded by SPLADE-distill CoCodenser-medium
beir-v1.0.0-webis-touche2020-splade_distil_cocodenser_medium [readme]
Lucene impact index of BEIR (v1.0.0): Webis-Touche2020 encoded by SPLADE-distill CoCodenser-medium
beir-v1.0.0-cqadupstack-android-splade_distil_cocodenser_medium [readme]
Lucene impact index of BEIR (v1.0.0): CQADupStack-android encoded by SPLADE-distill CoCodenser-medium
beir-v1.0.0-cqadupstack-english-splade_distil_cocodenser_medium [readme]
Lucene impact index of BEIR (v1.0.0): CQADupStack-english encoded by SPLADE-distill CoCodenser-medium
beir-v1.0.0-cqadupstack-gaming-splade_distil_cocodenser_medium [readme]
Lucene impact index of BEIR (v1.0.0): CQADupStack-gaming encoded by SPLADE-distill CoCodenser-medium
beir-v1.0.0-cqadupstack-gis-splade_distil_cocodenser_medium [readme]
Lucene impact index of BEIR (v1.0.0): CQADupStack-gis encoded by SPLADE-distill CoCodenser-medium
beir-v1.0.0-cqadupstack-mathematica-splade_distil_cocodenser_medium [readme]
Lucene impact index of BEIR (v1.0.0): CQADupStack-mathematica encoded by SPLADE-distill CoCodenser-medium
beir-v1.0.0-cqadupstack-physics-splade_distil_cocodenser_medium [readme]
Lucene impact index of BEIR (v1.0.0): CQADupStack-physics encoded by SPLADE-distill CoCodenser-medium
beir-v1.0.0-cqadupstack-programmers-splade_distil_cocodenser_medium [readme]
Lucene impact index of BEIR (v1.0.0): CQADupStack-programmers encoded by SPLADE-distill CoCodenser-medium
beir-v1.0.0-cqadupstack-stats-splade_distil_cocodenser_medium [readme]
Lucene impact index of BEIR (v1.0.0): CQADupStack-stats encoded by SPLADE-distill CoCodenser-medium
beir-v1.0.0-cqadupstack-tex-splade_distil_cocodenser_medium [readme]
Lucene impact index of BEIR (v1.0.0): CQADupStack-tex encoded by SPLADE-distill CoCodenser-medium
beir-v1.0.0-cqadupstack-unix-splade_distil_cocodenser_medium [readme]
Lucene impact index of BEIR (v1.0.0): CQADupStack-unix encoded by SPLADE-distill CoCodenser-medium
beir-v1.0.0-cqadupstack-webmasters-splade_distil_cocodenser_medium [readme]
Lucene impact index of BEIR (v1.0.0): CQADupStack-webmasters encoded by SPLADE-distill CoCodenser-medium
beir-v1.0.0-cqadupstack-wordpress-splade_distil_cocodenser_medium [readme]
Lucene impact index of BEIR (v1.0.0): CQADupStack-wordpress encoded by SPLADE-distill CoCodenser-medium
beir-v1.0.0-quora-splade_distil_cocodenser_medium [readme]
Lucene impact index of BEIR (v1.0.0): Quora encoded by SPLADE-distill CoCodenser-medium
beir-v1.0.0-dbpedia-entity-splade_distil_cocodenser_medium [readme]
Lucene impact index of BEIR (v1.0.0): DBPedia encoded by SPLADE-distill CoCodenser-medium
beir-v1.0.0-scidocs-splade_distil_cocodenser_medium [readme]
Lucene impact index of BEIR (v1.0.0): SCIDOCS encoded by SPLADE-distill CoCodenser-medium
beir-v1.0.0-fever-splade_distil_cocodenser_medium [readme]
Lucene impact index of BEIR (v1.0.0): FEVER encoded by SPLADE-distill CoCodenser-medium
beir-v1.0.0-climate-fever-splade_distil_cocodenser_medium [readme]
Lucene impact index of BEIR (v1.0.0): Climate-FEVER encoded by SPLADE-distill CoCodenser-medium
beir-v1.0.0-scifact-splade_distil_cocodenser_medium [readme]
Lucene impact index of BEIR (v1.0.0): SciFact encoded by SPLADE-distill CoCodenser-medium

Faiss Indexes

msmarco-v1-passage.aggretriever-cocondenser
Faiss FlatIP index of the MS MARCO passage corpus encoded by aggretriever-cocondenser encoder.
msmarco-v1-passage.aggretriever-distilbert
Faiss FlatIP index of the MS MARCO passage corpus encoded by aggretriever-distilbert encoder.
msmarco-v1-passage.ance
Faiss FlatIP index of the MS MARCO passage corpus encoded by the ANCE MS MARCO passage encoder
msmarco-v1-passage.distilbert-dot-margin-mse-t2
Faiss FlatIP index of the MS MARCO passage corpus encoded by the distilbert-dot-margin_mse-T2-msmarco encoder
msmarco-v1-passage.distilbert-dot-tas_b-b256
Faiss FlatIP index of the MS MARCO passage corpus encoded by distilbert-dot-tas_b-b256-msmarco encoder
msmarco-v1-passage.sbert
Faiss FlatIP index of the MS MARCO passage corpus encoded by the SBERT MS MARCO passage encoder
msmarco-v1-passage.tct_colbert
Faiss FlatIP index of the MS MARCO passage corpus encoded by TCT-ColBERT
msmarco-v1-passage.tct_colbert.hnsw
Faiss HNSW index of the MS MARCO passage corpus encoded by TCT-ColBERT
msmarco-v1-passage.tct_colbert-v2
Faiss FlatIP index of the MS MARCO passage corpus encoded by the tct_colbert-v2 passage encoder
msmarco-v1-passage.tct_colbert-v2-hn
Faiss FlatIP index of the MS MARCO passage corpus encoded by the tct_colbert-v2-hn passage encoder
msmarco-v1-passage.tct_colbert-v2-hnp
Faiss FlatIP index of the MS MARCO passage corpus encoded by the tct_colbert-v2-hnp passage encoder
msmarco-v1-passage.openai-ada2
Faiss FlatIP index of the MS MARCO passage corpus encoded by the OpenAI ada2 encoder
msmarco-v1-doc.ance-maxp
Faiss FlatIP index of the MS MARCO document corpus encoded by the ANCE MaxP encoder
msmarco-v1-doc.tct_colbert
Faiss FlatIP index of the MS MARCO document corpus encoded by TCT-ColBERT
msmarco-v1-doc-segmented.tct_colbert-v2-hnp
Faiss FlatIP index of the MS MARCO segmented document corpus encoded by TCT-ColBERT-V2-HNP
beir-v1.0.0-trec-covid.contriever [readme]
Faiss index for BEIR v1.0.0 (TREC-COVID) corpus encoded by Contriever encoder.
beir-v1.0.0-bioasq.contriever [readme]
Faiss index for BEIR v1.0.0 (BioASQ) corpus encoded by Contriever encoder.
beir-v1.0.0-nfcorpus.contriever [readme]
Faiss index for BEIR v1.0.0 (NFCorpus) corpus encoded by Contriever encoder.
beir-v1.0.0-nq.contriever [readme]
Faiss index for BEIR v1.0.0 (NQ) corpus encoded by Contriever encoder.
beir-v1.0.0-hotpotqa.contriever [readme]
Faiss index for BEIR v1.0.0 (HotpotQA) corpus encoded by Contriever encoder.
beir-v1.0.0-fiqa.contriever [readme]
Faiss index for BEIR v1.0.0 (FiQA-2018) corpus encoded by Contriever encoder.
beir-v1.0.0-signal1m.contriever [readme]
Faiss index for BEIR v1.0.0 (Signal-1M) corpus encoded by Contriever encoder.
beir-v1.0.0-trec-news.contriever [readme]
Faiss index for BEIR v1.0.0 (TREC-NEWS) corpus encoded by Contriever encoder.
beir-v1.0.0-robust04.contriever [readme]
Faiss index for BEIR v1.0.0 (Robust04) corpus encoded by Contriever encoder.
beir-v1.0.0-arguana.contriever [readme]
Faiss index for BEIR v1.0.0 (ArguAna) corpus encoded by Contriever encoder.
beir-v1.0.0-webis-touche2020.contriever [readme]
Faiss index for BEIR v1.0.0 (Webis-Touche2020) corpus encoded by Contriever encoder.
beir-v1.0.0-cqadupstack-android.contriever [readme]
Faiss index for BEIR v1.0.0 (CQADupStack-android) corpus encoded by Contriever encoder.
beir-v1.0.0-cqadupstack-english.contriever [readme]
Faiss index for BEIR v1.0.0 (CQADupStack-english) corpus encoded by Contriever encoder.
beir-v1.0.0-cqadupstack-gaming.contriever [readme]
Faiss index for BEIR v1.0.0 (CQADupStack-gaming) corpus encoded by Contriever encoder.
beir-v1.0.0-cqadupstack-gis.contriever [readme]
Faiss index for BEIR v1.0.0 (CQADupStack-gis) corpus encoded by Contriever encoder.
beir-v1.0.0-cqadupstack-mathematica.contriever [readme]
Faiss index for BEIR v1.0.0 (CQADupStack-mathematica) corpus encoded by Contriever encoder.
beir-v1.0.0-cqadupstack-physics.contriever [readme]
Faiss index for BEIR v1.0.0 (CQADupStack-physics) corpus encoded by Contriever encoder.
beir-v1.0.0-cqadupstack-programmers.contriever [readme]
Faiss index for BEIR v1.0.0 (CQADupStack-programmers) corpus encoded by Contriever encoder.
beir-v1.0.0-cqadupstack-stats.contriever [readme]
Faiss index for BEIR v1.0.0 (CQADupStack-stats) corpus encoded by Contriever encoder.
beir-v1.0.0-cqadupstack-tex.contriever [readme]
Faiss index for BEIR v1.0.0 (CQADupStack-tex) corpus encoded by Contriever encoder.
beir-v1.0.0-cqadupstack-unix.contriever [readme]
Faiss index for BEIR v1.0.0 (CQADupStack-unix) corpus encoded by Contriever encoder.
beir-v1.0.0-cqadupstack-webmasters.contriever [readme]
Faiss index for BEIR v1.0.0 (CQADupStack-webmasters) corpus encoded by Contriever encoder.
beir-v1.0.0-cqadupstack-wordpress.contriever [readme]
Faiss index for BEIR v1.0.0 (CQADupStack-wordpress) corpus encoded by Contriever encoder.
beir-v1.0.0-quora.contriever [readme]
Faiss index for BEIR v1.0.0 (Quora) corpus encoded by Contriever encoder.
beir-v1.0.0-dbpedia-entity.contriever [readme]
Faiss index for BEIR v1.0.0 (DBPedia) corpus encoded by Contriever encoder.
beir-v1.0.0-scidocs.contriever [readme]
Faiss index for BEIR v1.0.0 (SCIDOCS) corpus encoded by Contriever encoder.
beir-v1.0.0-fever.contriever [readme]
Faiss index for BEIR v1.0.0 (FEVER) corpus encoded by Contriever encoder.
beir-v1.0.0-climate-fever.contriever [readme]
Faiss index for BEIR v1.0.0 (Climate-FEVER) corpus encoded by Contriever encoder.
beir-v1.0.0-scifact.contriever [readme]
Faiss index for BEIR v1.0.0 (SciFact) corpus encoded by Contriever encoder.
beir-v1.0.0-trec-covid.contriever-msmarco [readme]
Faiss index for BEIR v1.0.0 (TREC-COVID) corpus encoded by Contriever encoder that has been fine-tuned with MS MARCO passage.
beir-v1.0.0-bioasq.contriever-msmarco [readme]
Faiss index for BEIR v1.0.0 (BioASQ) corpus encoded by Contriever encoder that has been fine-tuned with MS MARCO passage.
beir-v1.0.0-nfcorpus.contriever-msmarco [readme]
Faiss index for BEIR v1.0.0 (NFCorpus) corpus encoded by Contriever encoder that has been fine-tuned with MS MARCO passage.
beir-v1.0.0-nq.contriever-msmarco [readme]
Faiss index for BEIR v1.0.0 (NQ) corpus encoded by Contriever encoder that has been fine-tuned with MS MARCO passage.
beir-v1.0.0-hotpotqa.contriever-msmarco [readme]
Faiss index for BEIR v1.0.0 (HotpotQA) corpus encoded by Contriever encoder that has been fine-tuned with MS MARCO passage.
beir-v1.0.0-fiqa.contriever-msmarco [readme]
Faiss index for BEIR v1.0.0 (FiQA-2018) corpus encoded by Contriever encoder that has been fine-tuned with MS MARCO passage.
beir-v1.0.0-signal1m.contriever-msmarco [readme]
Faiss index for BEIR v1.0.0 (Signal-1M) corpus encoded by Contriever encoder that has been fine-tuned with MS MARCO passage.
beir-v1.0.0-trec-news.contriever-msmarco [readme]
Faiss index for BEIR v1.0.0 (TREC-NEWS) corpus encoded by Contriever encoder that has been fine-tuned with MS MARCO passage.
beir-v1.0.0-robust04.contriever-msmarco [readme]
Faiss index for BEIR v1.0.0 (Robust04) corpus encoded by Contriever encoder that has been fine-tuned with MS MARCO passage.
beir-v1.0.0-arguana.contriever-msmarco [readme]
Faiss index for BEIR v1.0.0 (ArguAna) corpus encoded by Contriever encoder that has been fine-tuned with MS MARCO passage.
beir-v1.0.0-webis-touche2020.contriever-msmarco [readme]
Faiss index for BEIR v1.0.0 (Webis-Touche2020) corpus encoded by Contriever encoder that has been fine-tuned with MS MARCO passage.
beir-v1.0.0-cqadupstack-android.contriever-msmarco [readme]
Faiss index for BEIR v1.0.0 (CQADupStack-android) corpus encoded by Contriever encoder that has been fine-tuned with MS MARCO passage.
beir-v1.0.0-cqadupstack-english.contriever-msmarco [readme]
Faiss index for BEIR v1.0.0 (CQADupStack-english) corpus encoded by Contriever encoder that has been fine-tuned with MS MARCO passage.
beir-v1.0.0-cqadupstack-gaming.contriever-msmarco [readme]
Faiss index for BEIR v1.0.0 (CQADupStack-gaming) corpus encoded by Contriever encoder that has been fine-tuned with MS MARCO passage.
beir-v1.0.0-cqadupstack-gis.contriever-msmarco [readme]
Faiss index for BEIR v1.0.0 (CQADupStack-gis) corpus encoded by Contriever encoder that has been fine-tuned with MS MARCO passage.
beir-v1.0.0-cqadupstack-mathematica.contriever-msmarco [readme]
Faiss index for BEIR v1.0.0 (CQADupStack-mathematica) corpus encoded by Contriever encoder that has been fine-tuned with MS MARCO passage.
beir-v1.0.0-cqadupstack-physics.contriever-msmarco [readme]
Faiss index for BEIR v1.0.0 (CQADupStack-physics) corpus encoded by Contriever encoder that has been fine-tuned with MS MARCO passage.
beir-v1.0.0-cqadupstack-programmers.contriever-msmarco [readme]
Faiss index for BEIR v1.0.0 (CQADupStack-programmers) corpus encoded by Contriever encoder that has been fine-tuned with MS MARCO passage.
beir-v1.0.0-cqadupstack-stats.contriever-msmarco [readme]
Faiss index for BEIR v1.0.0 (CQADupStack-stats) corpus encoded by Contriever encoder that has been fine-tuned with MS MARCO passage.
beir-v1.0.0-cqadupstack-tex.contriever-msmarco [readme]
Faiss index for BEIR v1.0.0 (CQADupStack-tex) corpus encoded by Contriever encoder that has been fine-tuned with MS MARCO passage.
beir-v1.0.0-cqadupstack-unix.contriever-msmarco [readme]
Faiss index for BEIR v1.0.0 (CQADupStack-unix) corpus encoded by Contriever encoder that has been fine-tuned with MS MARCO passage.
beir-v1.0.0-cqadupstack-webmasters.contriever-msmarco [readme]
Faiss index for BEIR v1.0.0 (CQADupStack-webmasters) corpus encoded by Contriever encoder that has been fine-tuned with MS MARCO passage.
beir-v1.0.0-cqadupstack-wordpress.contriever-msmarco [readme]
Faiss index for BEIR v1.0.0 (CQADupStack-wordpress) corpus encoded by Contriever encoder that has been fine-tuned with MS MARCO passage.
beir-v1.0.0-quora.contriever-msmarco [readme]
Faiss index for BEIR v1.0.0 (Quora) corpus encoded by Contriever encoder that has been fine-tuned with MS MARCO passage.
beir-v1.0.0-dbpedia-entity.contriever-msmarco [readme]
Faiss index for BEIR v1.0.0 (DBPedia) corpus encoded by Contriever encoder that has been fine-tuned with MS MARCO passage.
beir-v1.0.0-scidocs.contriever-msmarco [readme]
Faiss index for BEIR v1.0.0 (SCIDOCS) corpus encoded by Contriever encoder that has been fine-tuned with MS MARCO passage.
beir-v1.0.0-fever.contriever-msmarco [readme]
Faiss index for BEIR v1.0.0 (FEVER) corpus encoded by Contriever encoder that has been fine-tuned with MS MARCO passage.
beir-v1.0.0-climate-fever.contriever-msmarco [readme]
Faiss index for BEIR v1.0.0 (Climate-FEVER) corpus encoded by Contriever encoder that has been fine-tuned with MS MARCO passage.
beir-v1.0.0-scifact.contriever-msmarco [readme]
Faiss index for BEIR v1.0.0 (SciFact) corpus encoded by Contriever encoder that has been fine-tuned with MS MARCO passage.
mrtydi-v1.1-arabic-mdpr-nq [readme]
Faiss index for Mr.TyDi v1.1 (Arabic) corpus encoded by mDPR passage encoder pre-fine-tuned on NQ.
mrtydi-v1.1-bengali-mdpr-nq [readme]
Faiss index for Mr.TyDi v1.1 (Bengali) corpus encoded by mDPR passage encoder pre-fine-tuned on NQ.
mrtydi-v1.1-english-mdpr-nq [readme]
Faiss index for Mr.TyDi v1.1 (English) corpus encoded by mDPR passage encoder pre-fine-tuned on NQ.
mrtydi-v1.1-finnish-mdpr-nq [readme]
Faiss index for Mr.TyDi v1.1 (Finnish) corpus encoded by mDPR passage encoder pre-fine-tuned on NQ.
mrtydi-v1.1-indonesian-mdpr-nq [readme]
Faiss index for Mr.TyDi v1.1 (Indonesian) corpus encoded by mDPR passage encoder pre-fine-tuned on NQ.
mrtydi-v1.1-japanese-mdpr-nq [readme]
Faiss index for Mr.TyDi v1.1 (Japanese) corpus encoded by mDPR passage encoder pre-fine-tuned on NQ.
mrtydi-v1.1-korean-mdpr-nq [readme]
Faiss index for Mr.TyDi v1.1 (Korean) corpus encoded by mDPR passage encoder pre-fine-tuned on NQ.
mrtydi-v1.1-russian-mdpr-nq [readme]
Faiss index for Mr.TyDi v1.1 (Russian) corpus encoded by mDPR passage encoder pre-fine-tuned on NQ.
mrtydi-v1.1-swahili-mdpr-nq [readme]
Faiss index for Mr.TyDi v1.1 (Swahili) corpus encoded by mDPR passage encoder pre-fine-tuned on NQ.
mrtydi-v1.1-telugu-mdpr-nq [readme]
Faiss index for Mr.TyDi v1.1 (Telugu) corpus encoded by mDPR passage encoder pre-fine-tuned on NQ.
mrtydi-v1.1-thai-mdpr-nq [readme]
Faiss index for Mr.TyDi v1.1 (Thai) corpus encoded by mDPR passage encoder pre-fine-tuned on NQ.
mrtydi-v1.1-arabic-mdpr-tied-pft-msmarco [readme]
Faiss index for Mr.TyDi v1.1 (Arabic) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO.
mrtydi-v1.1-bengali-mdpr-tied-pft-msmarco [readme]
Faiss index for Mr.TyDi v1.1 (Bengali) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO.
mrtydi-v1.1-english-mdpr-tied-pft-msmarco [readme]
Faiss index for Mr.TyDi v1.1 (English) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO.
mrtydi-v1.1-finnish-mdpr-tied-pft-msmarco [readme]
Faiss index for Mr.TyDi v1.1 (Finnish) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO.
mrtydi-v1.1-indonesian-mdpr-tied-pft-msmarco [readme]
Faiss index for Mr.TyDi v1.1 (Indonesian) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO.
mrtydi-v1.1-japanese-mdpr-tied-pft-msmarco [readme]
Faiss index for Mr.TyDi v1.1 (Japanese) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO.
mrtydi-v1.1-korean-mdpr-tied-pft-msmarco [readme]
Faiss index for Mr.TyDi v1.1 (Korean) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO.
mrtydi-v1.1-russian-mdpr-tied-pft-msmarco [readme]
Faiss index for Mr.TyDi v1.1 (Russian) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO.
mrtydi-v1.1-swahili-mdpr-tied-pft-msmarco [readme]
Faiss index for Mr.TyDi v1.1 (Swahili) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO.
mrtydi-v1.1-telugu-mdpr-tied-pft-msmarco [readme]
Faiss index for Mr.TyDi v1.1 (Telugu) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO.
mrtydi-v1.1-thai-mdpr-tied-pft-msmarco [readme]
Faiss index for Mr.TyDi v1.1 (Thai) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO.
mrtydi-v1.1-arabic-mdpr-tied-pft-nq [readme]
Faiss index for Mr.TyDi v1.1 (Arabic) corpus encoded by mDPR passage encoder pre-fine-tuned on NQ.
mrtydi-v1.1-bengali-mdpr-tied-pft-nq [readme]
Faiss index for Mr.TyDi v1.1 (Bengali) corpus encoded by mDPR passage encoder pre-fine-tuned on NQ.
mrtydi-v1.1-english-mdpr-tied-pft-nq [readme]
Faiss index for Mr.TyDi v1.1 (English) corpus encoded by mDPR passage encoder pre-fine-tuned on NQ.
mrtydi-v1.1-finnish-mdpr-tied-pft-nq [readme]
Faiss index for Mr.TyDi v1.1 (Finnish) corpus encoded by mDPR passage encoder pre-fine-tuned on NQ.
mrtydi-v1.1-indonesian-mdpr-tied-pft-nq [readme]
Faiss index for Mr.TyDi v1.1 (Indonesian) corpus encoded by mDPR passage encoder pre-fine-tuned on NQ.
mrtydi-v1.1-japanese-mdpr-tied-pft-nq [readme]
Faiss index for Mr.TyDi v1.1 (Japanese) corpus encoded by mDPR passage encoder pre-fine-tuned on NQ.
mrtydi-v1.1-korean-mdpr-tied-pft-nq [readme]
Faiss index for Mr.TyDi v1.1 (Korean) corpus encoded by mDPR passage encoder pre-fine-tuned on NQ.
mrtydi-v1.1-russian-mdpr-tied-pft-nq [readme]
Faiss index for Mr.TyDi v1.1 (Russian) corpus encoded by mDPR passage encoder pre-fine-tuned on NQ.
mrtydi-v1.1-swahili-mdpr-tied-pft-nq [readme]
Faiss index for Mr.TyDi v1.1 (Swahili) corpus encoded by mDPR passage encoder pre-fine-tuned on NQ.
mrtydi-v1.1-telugu-mdpr-tied-pft-nq [readme]
Faiss index for Mr.TyDi v1.1 (Telugu) corpus encoded by mDPR passage encoder pre-fine-tuned on NQ.
mrtydi-v1.1-thai-mdpr-tied-pft-nq [readme]
Faiss index for Mr.TyDi v1.1 (Thai) corpus encoded by mDPR passage encoder pre-fine-tuned on NQ.
mrtydi-v1.1-arabic-mdpr-tied-pft-msmarco-ft-all [readme]
Faiss index for Mr.TyDi v1.1 (Arabic) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO, then fine-tuned on all languages.
mrtydi-v1.1-bengali-mdpr-tied-pft-msmarco-ft-all [readme]
Faiss index for Mr.TyDi v1.1 (Bengali) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO, then fine-tuned on all languages.
mrtydi-v1.1-english-mdpr-tied-pft-msmarco-ft-all [readme]
Faiss index for Mr.TyDi v1.1 (English) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO, then fine-tuned on all languages.
mrtydi-v1.1-finnish-mdpr-tied-pft-msmarco-ft-all [readme]
Faiss index for Mr.TyDi v1.1 (Finnish) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO, then fine-tuned on all languages.
mrtydi-v1.1-indonesian-mdpr-tied-pft-msmarco-ft-all [readme]
Faiss index for Mr.TyDi v1.1 (Indonesian) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO, then fine-tuned on all languages.
mrtydi-v1.1-japanese-mdpr-tied-pft-msmarco-ft-all [readme]
Faiss index for Mr.TyDi v1.1 (Japanese) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO, then fine-tuned on all languages.
mrtydi-v1.1-korean-mdpr-tied-pft-msmarco-ft-all [readme]
Faiss index for Mr.TyDi v1.1 (Korean) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO, then fine-tuned on all languages.
mrtydi-v1.1-russian-mdpr-tied-pft-msmarco-ft-all [readme]
Faiss index for Mr.TyDi v1.1 (Russian) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO, then fine-tuned on all languages.
mrtydi-v1.1-swahili-mdpr-tied-pft-msmarco-ft-all [readme]
Faiss index for Mr.TyDi v1.1 (Swahili) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO, then fine-tuned on all languages.
mrtydi-v1.1-telugu-mdpr-tied-pft-msmarco-ft-all [readme]
Faiss index for Mr.TyDi v1.1 (Telugu) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO, then fine-tuned on all languages.
mrtydi-v1.1-thai-mdpr-tied-pft-msmarco-ft-all [readme]
Faiss index for Mr.TyDi v1.1 (Thai) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO, then fine-tuned on all languages.
miracl-v1.0-ar-mdpr-tied-pft-msmarco [readme]
Faiss index for MIRACL v1.0 (Arabic) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO.
miracl-v1.0-bn-mdpr-tied-pft-msmarco [readme]
Faiss index for MIRACL v1.0 (Bengali) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO.
miracl-v1.0-en-mdpr-tied-pft-msmarco [readme]
Faiss index for MIRACL v1.0 (English) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO.
miracl-v1.0-es-mdpr-tied-pft-msmarco [readme]
Faiss index for MIRACL v1.0 (Spanish) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO.
miracl-v1.0-fa-mdpr-tied-pft-msmarco [readme]
Faiss index for MIRACL v1.0 (Persian) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO.
miracl-v1.0-fi-mdpr-tied-pft-msmarco [readme]
Faiss index for MIRACL v1.0 (Finnish) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO.
miracl-v1.0-fr-mdpr-tied-pft-msmarco [readme]
Faiss index for MIRACL v1.0 (French) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO.
miracl-v1.0-hi-mdpr-tied-pft-msmarco [readme]
Faiss index for MIRACL v1.0 (Hindi) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO.
miracl-v1.0-id-mdpr-tied-pft-msmarco [readme]
Faiss index for MIRACL v1.0 (Indonesian) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO.
miracl-v1.0-ja-mdpr-tied-pft-msmarco [readme]
Faiss index for MIRACL v1.0 (Japanese) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO.
miracl-v1.0-ko-mdpr-tied-pft-msmarco [readme]
Faiss index for MIRACL v1.0 (Korean) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO.
miracl-v1.0-ru-mdpr-tied-pft-msmarco [readme]
Faiss index for MIRACL v1.0 (Russian) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO.
miracl-v1.0-sw-mdpr-tied-pft-msmarco [readme]
Faiss index for MIRACL v1.0 (Swahili) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO.
miracl-v1.0-te-mdpr-tied-pft-msmarco [readme]
Faiss index for MIRACL v1.0 (Telugu) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO.
miracl-v1.0-th-mdpr-tied-pft-msmarco [readme]
Faiss index for MIRACL v1.0 (Thai) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO.
miracl-v1.0-zh-mdpr-tied-pft-msmarco [readme]
Faiss index for MIRACL v1.0 (Chinese) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO.
miracl-v1.0-de-mdpr-tied-pft-msmarco [readme]
Faiss index for MIRACL v1.0 (German) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO.
miracl-v1.0-yo-mdpr-tied-pft-msmarco [readme]
Faiss index for MIRACL v1.0 (Yoruba) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO.
miracl-v1.0-ar-mdpr-tied-pft-msmarco-ft-all [readme]
Faiss index for MIRACL v1.0 (Arabic) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO.
miracl-v1.0-bn-mdpr-tied-pft-msmarco-ft-all [readme]
Faiss index for MIRACL v1.0 (Bengali) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO.
miracl-v1.0-en-mdpr-tied-pft-msmarco-ft-all [readme]
Faiss index for MIRACL v1.0 (English) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO.
miracl-v1.0-es-mdpr-tied-pft-msmarco-ft-all [readme]
Faiss index for MIRACL v1.0 (Spanish) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO.
miracl-v1.0-fa-mdpr-tied-pft-msmarco-ft-all [readme]
Faiss index for MIRACL v1.0 (Persian) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO.
miracl-v1.0-fi-mdpr-tied-pft-msmarco-ft-all [readme]
Faiss index for MIRACL v1.0 (Finnish) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO.
miracl-v1.0-fr-mdpr-tied-pft-msmarco-ft-all [readme]
Faiss index for MIRACL v1.0 (French) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO.
miracl-v1.0-hi-mdpr-tied-pft-msmarco-ft-all [readme]
Faiss index for MIRACL v1.0 (Hindi) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO.
miracl-v1.0-id-mdpr-tied-pft-msmarco-ft-all [readme]
Faiss index for MIRACL v1.0 (Indonesian) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO.
miracl-v1.0-ja-mdpr-tied-pft-msmarco-ft-all [readme]
Faiss index for MIRACL v1.0 (Japanese) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO.
miracl-v1.0-ko-mdpr-tied-pft-msmarco-ft-all [readme]
Faiss index for MIRACL v1.0 (Korean) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO.
miracl-v1.0-ru-mdpr-tied-pft-msmarco-ft-all [readme]
Faiss index for MIRACL v1.0 (Russian) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO.
miracl-v1.0-sw-mdpr-tied-pft-msmarco-ft-all [readme]
Faiss index for MIRACL v1.0 (Swahili) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO.
miracl-v1.0-te-mdpr-tied-pft-msmarco-ft-all [readme]
Faiss index for MIRACL v1.0 (Telugu) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO.
miracl-v1.0-th-mdpr-tied-pft-msmarco-ft-all [readme]
Faiss index for MIRACL v1.0 (Thai) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO.
miracl-v1.0-zh-mdpr-tied-pft-msmarco-ft-all [readme]
Faiss index for MIRACL v1.0 (Chinese) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO.
miracl-v1.0-de-mdpr-tied-pft-msmarco-ft-all [readme]
Faiss index for MIRACL v1.0 (German) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO.
miracl-v1.0-yo-mdpr-tied-pft-msmarco-ft-all [readme]
Faiss index for MIRACL v1.0 (Yoruba) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO.
miracl-v1.0-ar-mdpr-tied-pft-msmarco-ft-miracl-ar [readme]
Faiss index for MIRACL v1.0 (Arabic) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO, then fine-tuned in-language with MIRACL.
miracl-v1.0-bn-mdpr-tied-pft-msmarco-ft-miracl-bn [readme]
Faiss index for MIRACL v1.0 (Bengali) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO, then fine-tuned in-language with MIRACL.
miracl-v1.0-en-mdpr-tied-pft-msmarco-ft-miracl-en [readme]
Faiss index for MIRACL v1.0 (English) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO, then fine-tuned in-language with MIRACL.
miracl-v1.0-es-mdpr-tied-pft-msmarco-ft-miracl-es [readme]
Faiss index for MIRACL v1.0 (Spanish) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO, then fine-tuned in-language with MIRACL.
miracl-v1.0-fa-mdpr-tied-pft-msmarco-ft-miracl-fa [readme]
Faiss index for MIRACL v1.0 (Persian) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO, then fine-tuned in-language with MIRACL.
miracl-v1.0-fi-mdpr-tied-pft-msmarco-ft-miracl-fi [readme]
Faiss index for MIRACL v1.0 (Finnish) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO, then fine-tuned in-language with MIRACL.
miracl-v1.0-fr-mdpr-tied-pft-msmarco-ft-miracl-fr [readme]
Faiss index for MIRACL v1.0 (French) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO, then fine-tuned in-language with MIRACL.
miracl-v1.0-hi-mdpr-tied-pft-msmarco-ft-miracl-hi [readme]
Faiss index for MIRACL v1.0 (Hindi) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO, then fine-tuned in-language with MIRACL.
miracl-v1.0-id-mdpr-tied-pft-msmarco-ft-miracl-id [readme]
Faiss index for MIRACL v1.0 (Indonesian) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO, then fine-tuned in-language with MIRACL.
miracl-v1.0-ja-mdpr-tied-pft-msmarco-ft-miracl-ja [readme]
Faiss index for MIRACL v1.0 (Japanese) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO, then fine-tuned in-language with MIRACL.
miracl-v1.0-ko-mdpr-tied-pft-msmarco-ft-miracl-ko [readme]
Faiss index for MIRACL v1.0 (Korean) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO, then fine-tuned in-language with MIRACL.
miracl-v1.0-ru-mdpr-tied-pft-msmarco-ft-miracl-ru [readme]
Faiss index for MIRACL v1.0 (Russian) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO, then fine-tuned in-language with MIRACL.
miracl-v1.0-sw-mdpr-tied-pft-msmarco-ft-miracl-sw [readme]
Faiss index for MIRACL v1.0 (Swahili) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO, then fine-tuned in-language with MIRACL.
miracl-v1.0-te-mdpr-tied-pft-msmarco-ft-miracl-te [readme]
Faiss index for MIRACL v1.0 (Telugu) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO, then fine-tuned in-language with MIRACL.
miracl-v1.0-th-mdpr-tied-pft-msmarco-ft-miracl-th [readme]
Faiss index for MIRACL v1.0 (Thai) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO, then fine-tuned in-language with MIRACL.
miracl-v1.0-zh-mdpr-tied-pft-msmarco-ft-miracl-zh [readme]
Faiss index for MIRACL v1.0 (Chinese) corpus encoded by mDPR passage encoder pre-fine-tuned on MS MARCO, then fine-tuned in-language with MIRACL.
miracl-v1.0-ar-mcontriever-pft-msmarco [readme]
Faiss index for MIRACL v1.0 (Arabic) corpus encoded by mContriever passage encoder pre-fine-tuned on MS MARCO.
miracl-v1.0-bn-mcontriever-pft-msmarco [readme]
Faiss index for MIRACL v1.0 (Bengali) corpus encoded by mContriever passage encoder pre-fine-tuned on MS MARCO.
miracl-v1.0-en-mcontriever-pft-msmarco [readme]
Faiss index for MIRACL v1.0 (English) corpus encoded by mContriever passage encoder pre-fine-tuned on MS MARCO.
miracl-v1.0-es-mcontriever-pft-msmarco [readme]
Faiss index for MIRACL v1.0 (Spanish) corpus encoded by mContriever passage encoder pre-fine-tuned on MS MARCO.
miracl-v1.0-fa-mcontriever-pft-msmarco [readme]
Faiss index for MIRACL v1.0 (Persian) corpus encoded by mContriever passage encoder pre-fine-tuned on MS MARCO.
miracl-v1.0-fi-mcontriever-pft-msmarco [readme]
Faiss index for MIRACL v1.0 (Finnish) corpus encoded by mContriever passage encoder pre-fine-tuned on MS MARCO.
miracl-v1.0-fr-mcontriever-pft-msmarco [readme]
Faiss index for MIRACL v1.0 (French) corpus encoded by mContriever passage encoder pre-fine-tuned on MS MARCO.
miracl-v1.0-hi-mcontriever-pft-msmarco [readme]
Faiss index for MIRACL v1.0 (Hindi) corpus encoded by mContriever passage encoder pre-fine-tuned on MS MARCO.
miracl-v1.0-id-mcontriever-pft-msmarco [readme]
Faiss index for MIRACL v1.0 (Indonesian) corpus encoded by mContriever passage encoder pre-fine-tuned on MS MARCO.
miracl-v1.0-ja-mcontriever-pft-msmarco [readme]
Faiss index for MIRACL v1.0 (Japanese) corpus encoded by mContriever passage encoder pre-fine-tuned on MS MARCO.
miracl-v1.0-ko-mcontriever-pft-msmarco [readme]
Faiss index for MIRACL v1.0 (Korean) corpus encoded by mContriever passage encoder pre-fine-tuned on MS MARCO.
miracl-v1.0-ru-mcontriever-pft-msmarco [readme]
Faiss index for MIRACL v1.0 (Russian) corpus encoded by mContriever passage encoder pre-fine-tuned on MS MARCO.
miracl-v1.0-sw-mcontriever-pft-msmarco [readme]
Faiss index for MIRACL v1.0 (Swahili) corpus encoded by mContriever passage encoder pre-fine-tuned on MS MARCO.
miracl-v1.0-te-mcontriever-pft-msmarco [readme]
Faiss index for MIRACL v1.0 (Telugu) corpus encoded by mContriever passage encoder pre-fine-tuned on MS MARCO.
miracl-v1.0-th-mcontriever-pft-msmarco [readme]
Faiss index for MIRACL v1.0 (Thai) corpus encoded by mContriever passage encoder pre-fine-tuned on MS MARCO.
miracl-v1.0-zh-mcontriever-pft-msmarco [readme]
Faiss index for MIRACL v1.0 (Chinese) corpus encoded by mContriever passage encoder pre-fine-tuned on MS MARCO.
miracl-v1.0-de-mcontriever-pft-msmarco [readme]
Faiss index for MIRACL v1.0 (German) corpus encoded by mContriever passage encoder pre-fine-tuned on MS MARCO.
miracl-v1.0-yo-mcontriever-pft-msmarco [readme]
Faiss index for MIRACL v1.0 (Yoruba) corpus encoded by mContriever passage encoder pre-fine-tuned on MS MARCO.
wikipedia-dpr-100w.dpr-multi
Faiss FlatIP index of Wikipedia encoded by the DPR doc encoder trained on multiple QA datasets.
wikipedia-dpr-100w.dpr-single-nq
Faiss FlatIP index of Wikipedia encoded by the DPR doc encoder trained on NQ.
wikipedia-dpr-100w.bpr-single-nq
Faiss binary index of Wikipedia encoded by the BPR doc encoder trained on NQ.
wikipedia-dpr-100w.ance-multi
Faiss FlatIP index of Wikipedia encoded by the ANCE-multi encoder.
wikipedia-dpr-100w.dkrr-nq
Faiss FlatIP index of Wikipedia DPR encoded by the retriever model from 'Distilling Knowledge from Reader to Retriever for Question Answering' trained on NQ.
wikipedia-dpr-100w.dkrr-tqa
Faiss FlatIP index of Wikipedia DPR encoded by the retriever model from 'Distilling Knowledge from Reader to Retriever for Question Answering' trained on TriviaQA.
wiki-all-6-3.dpr2-multi-retriever [readme]
Faiss FlatIP index of wiki-all-6-3-tamber encoded by a 2nd iteration DPR model trained on multiple QA datasets.
cast2019-tct_colbert-v2.hnsw [readme]
Faiss HNSW index of the CAsT2019 passage corpus encoded by the tct_colbert-v2 passage encoder.
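The MIRACL index names listed above follow a regular pattern, `miracl-v1.0-<lang>-<model>`, so they can be generated rather than typed by hand when looping over languages. The sketch below is only an illustration: `MIRACL_LANGUAGES` and `miracl_index_name` are hypothetical helpers, not part of Pyserini, and the language codes are copied from the entries above.

```python
# Hypothetical helper (not part of Pyserini): builds MIRACL Faiss index names
# following the naming pattern visible in the list above.

# Language codes and names as they appear in the MIRACL v1.0 entries.
MIRACL_LANGUAGES = {
    'ar': 'Arabic', 'bn': 'Bengali', 'en': 'English', 'es': 'Spanish',
    'fa': 'Persian', 'fi': 'Finnish', 'fr': 'French', 'hi': 'Hindi',
    'id': 'Indonesian', 'ja': 'Japanese', 'ko': 'Korean', 'ru': 'Russian',
    'sw': 'Swahili', 'te': 'Telugu', 'th': 'Thai', 'zh': 'Chinese',
    'de': 'German', 'yo': 'Yoruba',
}


def miracl_index_name(lang: str, model: str = 'mdpr-tied-pft-msmarco') -> str:
    """Return the index name for a MIRACL language code and model suffix."""
    if lang not in MIRACL_LANGUAGES:
        raise ValueError(f'Unknown MIRACL language code: {lang!r}')
    return f'miracl-v1.0-{lang}-{model}'


if __name__ == '__main__':
    # Print the mContriever index name for every language.
    for code, language in MIRACL_LANGUAGES.items():
        print(f'{language}: {miracl_index_name(code, "mcontriever-pft-msmarco")}')
```

Not every language and model combination is available (for example, the `ft-miracl-<lang>` variants above are not listed for German or Yoruba), so a generated name should be checked against this list before it is passed to `from_prebuilt_index`.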