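Online corpora (domestic)
- Corpus:http://yulk.org/
- BCC Corpus:http://bcc.blcu.edu.cn/
- Corpus Online:http://www.cncorpus.org/
- Center for Chinese Linguistics, Peking University (CCL):http://ccl.pku.edu.cn/corpus.asp
- BFSU Corpus Linguistics:http://www.bfsu-corpus.org/
- Balanced Corpus of Modern Chinese (Sinica Corpus):http://www.sinica.edu.tw/SinicaCorpus/
- Ancient Chinese Corpus:http://www.sinica.edu.tw/ftms-bin/ftmsw
- Tagged Corpus of Early Mandarin Chinese:http://www.sinica.edu.tw/Early_Mandarin/
- Treebank database:http://treebank.sinica.edu.tw/
- Sou Wen Jie Zi (搜文解字):http://words.sinica.edu.tw/
- Scripta Sinica (Chinese electronic texts):http://www.sinica.edu.tw/~tdbproj/handy1/
- Communication University of China Text Corpus Retrieval System:http://ling.cuc.edu.cn/RawPub/
- HIT Information Retrieval Lab shared corpus resources:http://ir.hit.edu.cn/demo/ltp/Sharing_Plan.htm
- Hong Kong Institute of Education, Centre for Language Information Sciences and its corpus laboratory:http://www.livac.org/index.php?lang=sc
- Chinese Linguistic Data Consortium:http://www.chineseldc.org/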
Online corpora (international)
- BNC (British National Corpus):http://www.natcorp.ox.ac.uk/
- BOE (the Bank of English, Collins English corpus):http://www.collinslanguage.com/language-resources/dictionary-datasets/
- ANC (American National Corpus):http://www.anc.org/
- Lancaster Corpus of Mandarin Chinese (LCMC):http://ota.oucs.ox.ac.uk/scripts/download.php?otaid=2474
- Sketch Engine multilingual corpora:http://www.sketchengine.co.uk
- BASE (British Academic Spoken English Corpus):http://www2.warwick.ac.uk/fac/soc/celte/research/base/
- Lextutor:http://www.lextutor.ca/
- My Memory:https://mymemory.translated.net/
- TAUS:http://www.tausdata.org/index.php/language-search-engine
- TTMEM:https://www.ttmem.com/terminology/download-translation-memory/
- TinyTM:http://tinytm.sourceforge.net/
- DGT Translation Memory:https://magmatranslation.com/en/free-translation-memory/
- European Parliament Proceedings Parallel Corpus 1996-2011:http://statmt.org/europarl/
- University of Maryland Parallel Corpus Project: The Bible:http://users.umiacs.umd.edu/~resnik/parallel/bible.html
- Aligned Hansards of the 36th Parliament of Canada:https://www.isi.edu/natural-language/download/hansard/
- Publications Office of the EU:https://publications.europa.eu/en/web/general-publications/publications
- Wikimedia Downloads:https://dumps.wikimedia.org/backup-index.html
- Open Subtitles:https://www.opensubtitles.org/en/search/subs
- United Nations Parallel Corpus:https://cms.unov.org/UNCorpus/
- European language pairs:http://www.statmt.org/wmt13/translation-task.html#download
- parallel corpus search:http://paralela.clarin-pl.eu/#
- UM-Corpus: A Large English-Chinese Parallel Corpus:http://nlp2ct.cis.umac.mo/um-corpus/um-corpus-license.html
- Clarin Parallel corpora:https://www.clarin.eu/resource-families/parallel-corpora
- The PKU 863 Chinese-English Parallel Corpus:https://www.lancaster.ac.uk/fass/projects/corpus/863parallel/
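- Chinese-English Parallel Corpus of Hong Lou Meng (Dream of the Red Chamber):http://corpus.usx.edu.cn/hongloumeng/images/shiyongshuoming.htm
- Academia Sinica Tagged Corpus of Early Mandarin Chinese:http://lingcorpus.iis.sinica.edu.tw/early/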
- BYU corpora: https://corpus.byu.edu/
Other subcorpora
- Books – A collection of translated literature
- DGT – A collection of EU Translation Memories provided by the JRC
- DOGC – Documents from the Catalan Government
- ECB – European Central Bank corpus
- EMEA – European Medicines Agency documents
- The EU bookshop corpus
- EUconst – The European constitution
- EUROPARL v7 – European Parliament Proceedings
- giga-fren – French-English Giga-Word Corpus
- GNOME – GNOME localization files
- Global Voices – News stories in various languages
- The Croatian – English WaC corpus
- JRC-Acquis – legislative EU texts
- KDE4 – KDE4 localization files (v.2)
- KDEdoc – the KDE manual corpus
- MBS – Belgisch Staatsblad corpus
- memat – Xhosa/English parallel data
- MontenegrinSubs – Montenegrin movie subtitles
- MultiUN – Translated UN documents
- News Commentary, v9.0, v9.1
- OfisPublik – Breton – French parallel texts
- OO – the OpenOffice.org corpus
- OpenOffice.org 3 corpus
- OpenSubtitles – the opensubtitles.org corpus
- OpenSubtitles2011, OpenSubtitles2012, OpenSubtitles2013
- OpenSubtitles2016 – snapshot from 2016
- OpenSubtitles2018 – new complete version
- ParaCrawl corpus
- ParCor – A Parallel Pronoun-Coreference Corpus
- PHP – the PHP manual corpus
- Regeringsförklaringen – a tiny example corpus
- SETIMES – A parallel corpus of the Balkan languages
- SPC – Stockholm Parallel Corpora
- Tatoeba – A DB of translated sentences
- TedTalks hr-en
- TED Talks 2013
- Tanzil – A collection of Quran translations
- TEP – The Tehran English-Persian subtitle corpus
- Ubuntu – Ubuntu localization files
- UN – Translated UN documents
- Wikipedia – translated sentences from Wikipedia
- WikiSource – (small en-sv sample only)
- WMT News Test Sets
- The Xhosa – English Navy corpus
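Online terminology databases
- China Keywords:http://www.china.org.cn/chinese/china_key_words/
- Standardized terminology database for translating discourse with Chinese characteristics:http://210.72.20.108/index/index.jsp
- China core vocabulary:https://www.cnkeywords.net/index
- Key concepts in Chinese thought and culture:http://www.chinesethought.cn/TermBase.aspx
- United Nations terminology database (UNTERM):https://unterm.un.org/UNTERM/portal/welcome
- Termonline:http://termonline.cn/index.htm
- National Academy for Educational Research terminology database:http://terms.naer.edu.tw/download/
- Blockchain-related terminology:http://8btc.com/thread-17286-16-1.html
- Chinese-English Dictionary of Ming Government Official Titles: https://escholarship.org/uc/item/2bz3v185
- China standardized terminology (CNKI): http://shuyu.cnki.net/index.aspx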
- Grand Dictionnaire Terminologique: http://www.granddictionnaire.com/
- TERMIUM: http://www.btb.termiumplus.gc.ca/tpv2alpha/alpha-eng.html?lang=eng
- LingoSail TermBox (語帆術語寶):http://termbox.lingosail.com/
- Microsoft terminology database (Chinese):https://www.microsoft.com/zh-cn/language
- World Health Organization terminology database:http://www.who.int/substance_abuse/terminology/zh/
- Electronic engineering glossary:https://www.maximintegrated.com/cn/glossary/definitions.mvp/terms/all
- MDict 100 GB offline dictionary collection download:https://downloads.freemdict.com/
- OneDict:http://www.onedict.com/
- National standard "Logistics Terminology":http://zizhan.mot.gov.cn/zhuantizhuanlan/gonglujiaotong/shoufeigongluzmk/zhengcefagui/201508/t20150814_1863913.html
- Winter Olympics terminology lookup site:http://owgt.lingosail.com/
- Music terminology lookup:http://dictionary.t-classical.com/
- European Union Language and terminology:https://europa.eu/european-union/documents-publications/language-and-terminology_en
- IATE (Interactive Terminology for Europe), the EU's terminology database:https://iate.europa.eu/home
- Hong Kong legislation Chinese-English glossary:https://www.elegislation.gov.hk/glossary/chi
- Magic Search:http://magicsearch.org
- Microsoft Language Portal:https://www.microsoft.com/en-us/language
- Linguee:https://www.linguee.com/
- The Free Dictionary:http://www.thefreedictionary.com/
- Glosbe:https://glosbe.com/tmem/