Participants:
1. 余艾鍶, 2. 程會林, 3. 黃莉婷, 4. 梁清源, 5. 曾偉, 6. 陳南浩
Completion is checked on: blog posts (reading notes), answers to the course exercises, code, and answers to the guiding questions.
"Text Mining and Analytics" (12.13)
https://www.coursera.org/learn/text-mining
Week1:
Guiding Questions
Develop your answers to the following guiding questions while watching the video lectures throughout the week.
- What does a computer have to do in order to understand a natural language sentence?
- What is ambiguity?
- Why is natural language processing (NLP) difficult for computers?
- What is bag-of-words representation?
- Why is this word-based representation more robust than representations derived from syntactic and semantic analysis of text?
- What is a paradigmatic relation?
- What is a syntagmatic relation?
- What is the general idea for discovering paradigmatic relations from text?
- What is the general idea for discovering syntagmatic relations from text?
- Why do we want to do Term Frequency Transformation when computing similarity of context?
- How does BM25 Term Frequency transformation work?
- Why do we want to do Inverse Document Frequency (IDF) weighting when computing similarity of context? (a short sketch of both weightings follows this list)
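As a quick reference for the BM25 and IDF questions above, here is a minimal Python sketch. The parameter k, the choice of log base, and the toy counts are assumptions for illustration, not the course's exact setup.

```python
import math

def bm25_tf(count, k=1.2):
    # BM25-style term frequency transformation: grows with the raw count but
    # is bounded above by (k + 1), so a single very frequent word cannot
    # dominate the context vector.
    return (k + 1) * count / (count + k)

def idf(doc_freq, num_docs):
    # Inverse document frequency: rare words get a large weight, while words
    # that occur in (almost) every document get a weight near zero.
    return math.log((num_docs + 1) / doc_freq)

# Toy numbers (assumed): a word occurring 3 times in a context window and
# appearing in 100 of 10,000 documents.
weight = bm25_tf(3) * idf(100, 10_000)
print(round(weight, 3))
```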
Not yet completed:
Completed:
黃莉婷
http://blog.csdn.net/weixin_40962955/article/details/78828721
梁清源
http://blog.csdn.net/qq_33414271/article/details/78802272
http://www.lxweimin.com/u/337e85e2a284
曾偉
http://www.lxweimin.com/p/9e520d5ccdaa
程會林
http://blog.csdn.net/qq_35159009/article/details/78836340
余艾鍶
http://blog.csdn.net/xy773545778/article/details/78829053
陳南浩
http://blog.csdn.net/DranGoo/article/details/78850788
Week2:
Guiding Questions
Develop your answers to the following guiding questions while watching the video lectures throughout the week.
- What is entropy? For what kind of random variables does the entropy function reach its minimum and maximum, respectively? 1
- What is conditional entropy? 2
- What is the relation between conditional entropy H(X|Y) and entropy H(X)? Which is larger? 3
- How can conditional entropy be used for discovering syntagmatic relations? 4
- What is mutual information I(X;Y)? How is it related to entropy H(X) and conditional entropy H(X|Y)? 5
- What’s the minimum value of I(X;Y)? Is it symmetric? 6
- For what kind of X and Y, does mutual information I(X;Y) reach its minimum? For a given X, for what Y does I(X;Y) reach its maximum? 1
- Why is mutual information sometimes more useful for discovering syntagmatic relations than conditional entropy? (a numeric sketch of these quantities follows this list)
- What is a topic? 2
- How can we define the task of topic mining and analysis computationally? What’s the input? What’s the output? 3
- How can we heuristically solve the problem of topic mining and analysis by treating a term as a topic? What are the main problems of such an approach? 4
- What are the benefits of representing a topic by a word distribution? 5
- What is a statistical language model? What is a unigram language model? How can we compute the probability of a sequence of words given a unigram language model? 6
- What is Maximum Likelihood estimate of a unigram language model given a text article? 1
- What is the basic idea of Bayesian estimation? What is a prior distribution? What is a posterior distribution? How are they related with each other? What is Bayes rule? 2
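To make the entropy and mutual information questions above concrete, here is a minimal Python sketch for two binary word-presence variables. The joint probabilities and the words "cat"/"dog" are made-up toy values; the point is only to show H(X), H(X|Y), and I(X;Y) = H(X) - H(X|Y) computed from one small table.

```python
import math

def h(probs):
    # Shannon entropy (in bits) of a discrete distribution. It is 0 when one
    # outcome has probability 1 and maximal for the uniform distribution.
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Assumed toy joint distribution: X = "cat" occurs in a sentence,
# Y = "dog" occurs in the same sentence.
p = {(0, 0): 0.70, (0, 1): 0.10, (1, 0): 0.10, (1, 1): 0.10}

p_x = [p[(0, 0)] + p[(0, 1)], p[(1, 0)] + p[(1, 1)]]   # marginal P(X)
p_y = [p[(0, 0)] + p[(1, 0)], p[(0, 1)] + p[(1, 1)]]   # marginal P(Y)

# Conditional entropy H(X|Y) = sum_y P(Y=y) * H(X | Y=y)
h_x_given_y = sum(p_y[y] * h([p[(0, y)] / p_y[y], p[(1, y)] / p_y[y]])
                  for y in (0, 1))

# Mutual information I(X;Y) = H(X) - H(X|Y); it is symmetric and >= 0,
# equal to 0 exactly when X and Y are independent.
mi = h(p_x) - h_x_given_y

print(round(h(p_x), 4), round(h_x_given_y, 4), round(mi, 4))
```

Since H(X) >= H(X|Y), the printed I(X;Y) is non-negative; a clearly positive value in this toy table would suggest a syntagmatic association between the two words.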
Not yet completed: 陳南浩
Completed:
梁清源
http://blog.csdn.net/qq_33414271/article/details/78871154
程會林
http://www.lxweimin.com/p/61614d406b0f
黃莉婷
http://blog.csdn.net/weixin_40962955/article/details/78877103
余艾鍶
http://blog.csdn.net/xy773545778/article/details/78848613
曾偉
http://blog.csdn.net/qq_39759159/article/details/78882651
Week3:
Guiding Questions
Develop your answers to the following guiding questions while watching the video lectures throughout the week.
- What is a mixture model? In general, how do you compute the probability of observing a particular word from a mixture model? What is the general form of the expression for this probability? 3
- What does the maximum likelihood estimate of the component word distributions of a mixture model behave like? In what sense do they “collaborate” and/or “compete”? 4
- Why can we use a fixed background word distribution to force a discovered topic word distribution to reduce its probability on the common (often non-content) words? 5
- What is the basic idea of the EM algorithm? What does the E-step typically do? What does the M-step typically do? In which of the two steps do we typically apply the Bayes rule? Does EM converge to a global maximum? 6 (a minimal EM sketch follows this list)
- What is PLSA? How many parameters does a PLSA model have? How is this number affected by the size of our data set to be mined? How can we adjust the standard PLSA to incorporate a prior on a topic word distribution? 1
- How is LDA different from PLSA? What is shared by the two models? 2
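For the mixture-model and EM questions above, here is a minimal Python sketch of EM for a two-component unigram mixture: a fixed background word distribution plus one topic distribution to be estimated. The background probabilities, the toy document, the mixing weight lam, and the fixed iteration count are assumptions for illustration; a fuller implementation would monitor the log-likelihood for convergence.

```python
from collections import Counter

def em_one_topic(doc_words, background, lam=0.5, iters=50):
    # Estimate one topic word distribution from a document, mixed with a
    # fixed background distribution (the background component has weight lam).
    counts = Counter(doc_words)
    vocab = list(counts)
    topic = {w: 1.0 / len(vocab) for w in vocab}   # uniform initialisation
    for _ in range(iters):
        # E-step: Bayes rule gives P(z = topic | w) for each word type.
        post = {w: (1 - lam) * topic[w] /
                   ((1 - lam) * topic[w] + lam * background.get(w, 1e-12))
                for w in vocab}
        # M-step: normalised soft counts give the new topic distribution.
        total = sum(counts[w] * post[w] for w in vocab)
        topic = {w: counts[w] * post[w] / total for w in vocab}
    return topic

# Assumed toy data: the background puts most mass on function words, so EM
# pushes the topic distribution toward content words like "text" and "mining".
background = {"the": 0.5, "is": 0.3, "text": 0.1, "mining": 0.1}
doc = ["the", "text", "mining", "the", "is", "text", "mining", "mining"]
for word, prob in sorted(em_one_topic(doc, background).items(),
                         key=lambda kv: -kv[1]):
    print(word, round(prob, 3))
```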
Not yet completed: 余艾鍶
Completed:
程會林: why is the normalization in the formulas different?
http://www.lxweimin.com/p/bcef1ad7a530?utm_campaign=haruki&utm_content=note&utm_medium=reader_share&utm_source=qq
曾偉
http://www.cnblogs.com/Negan-ZW/p/8179076.html
梁清源
http://blog.csdn.net/qq_33414271/article/details/78938301
黃莉婷: the principles of LDA
http://blog.csdn.net/weixin_40962955/article/details/78941383#t10
陳南浩
http://blog.csdn.net/DranGoo/article/details/78968749
Week4:
Guiding Questions
Develop your answers to the following guiding questions while watching the video lectures throughout the week.
- What is clustering? What are some applications of clustering in text mining and analysis? 3
- How can we use a mixture model to do document clustering? How many parameters are there in such a model? 4
- How is the mixture model for document clustering related to a topic model such as PLSA? In what way are they similar? Where are they different? 5
- How do we determine the cluster for each document after estimating all the parameters of a mixture model? 6
- How does hierarchical agglomerative clustering work? How do single-link, complete-link, and average-link work for computing group similarity? Which of these three ways of computing group similarity is least sensitive to outliers in the data? 1
- How do we evaluate clustering results? 2
- What is text categorization? What are some applications of text categorization? 3
- What does the training data for categorization look like?
- How does the Naïve Bayes classifier work? 4
- Why do we often use the logarithm in the scoring function for Naïve Bayes? 5 (a log-space scoring sketch follows this list)
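For the Naïve Bayes questions above, here is a minimal Python sketch of a multinomial Naïve Bayes classifier with add-one smoothing. Scoring is done as a sum of logarithms rather than a product of many small probabilities, which is the usual reason for the log in the scoring function (it avoids floating-point underflow on long documents). The vocabulary, training documents, and labels are made-up toy data.

```python
import math
from collections import Counter

def train_nb(docs_by_class, vocab):
    # Multinomial Naive Bayes with add-one (Laplace) smoothing.
    # docs_by_class: {class_label: list of token lists}.
    priors, cond = {}, {}
    total_docs = sum(len(docs) for docs in docs_by_class.values())
    for c, docs in docs_by_class.items():
        priors[c] = len(docs) / total_docs
        counts = Counter(tok for doc in docs for tok in doc)
        denom = sum(counts.values()) + len(vocab)
        cond[c] = {w: (counts[w] + 1) / denom for w in vocab}
    return priors, cond

def classify(doc, priors, cond):
    # Log-space score: log P(c) + sum_w log P(w | c). Words outside the
    # vocabulary are simply skipped in this toy version.
    scores = {c: math.log(priors[c]) +
                 sum(math.log(cond[c][w]) for w in doc if w in cond[c])
              for c in priors}
    return max(scores, key=scores.get)

vocab = {"ball", "goal", "vote", "election"}
train = {"sports":   [["ball", "goal"], ["goal", "goal", "ball"]],
         "politics": [["vote", "election"], ["election", "vote", "vote"]]}
priors, cond = train_nb(train, vocab)
print(classify(["goal", "ball", "ball"], priors, cond))   # expected: sports
```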
Not yet completed: 余艾鍶, 程會林, 黃莉婷, 梁清源, 曾偉, 陳南浩
Completed:
Week5:
Not yet completed: 余艾鍶, 程會林, 黃莉婷, 梁清源, 曾偉, 陳南浩
Completed:
Week6:
Not yet completed: 余艾鍶, 程會林, 黃莉婷, 梁清源, 曾偉, 陳南浩
Completed:
"Text Retrieval and Search Engines" (12.13)
https://www.coursera.org/learn/text-retrieval
Week1:
Not yet completed: 余艾鍶, 程會林, 黃莉婷, 梁清源, 曾偉, 陳南浩
Completed:
Week2:
Not yet completed: 余艾鍶, 程會林, 黃莉婷, 梁清源, 曾偉, 陳南浩
Completed:
Week3:
Not yet completed: 余艾鍶, 程會林, 黃莉婷, 梁清源, 曾偉, 陳南浩
Completed:
Week4:
Not yet completed: 余艾鍶, 程會林, 黃莉婷, 梁清源, 曾偉, 陳南浩
Completed:
Week5:
Not yet completed: 余艾鍶, 程會林, 黃莉婷, 梁清源, 曾偉, 陳南浩
Completed:
Week6:
Not yet completed: 余艾鍶, 程會林, 黃莉婷, 梁清源, 曾偉, 陳南浩
Completed: