- Making Recommendations (Collaborative Filtering)
- User-based
- Finding similar users
- User as vector based on item score
- Euclidean distance
- Pearson correlation
- Swapping users and items lets us find items similar to a given item
- Sort and recommend items based on
- sum(user similarity * user’s item score) / sum(user similarity), taken over the other users who scored the item
- Item-based
- Find item similarities
- These results can be cached and periodically updated
- Sort and recommend items based on
- sum(item similarity * user’s item score) / sum(item similarity), taken over the items the user has scored
- Significantly faster than user-based filtering on large datasets, and better for sparse datasets
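
The user-based steps above can be sketched roughly as follows. This is a minimal illustration, not the book's code: the `ratings` data and the names `pearson` and `recommend` are hypothetical, and users with zero or negative similarity are simply ignored.

```python
from math import sqrt

# Hypothetical ratings: user -> {item: score}
ratings = {
    "alice": {"A": 5.0, "B": 3.0, "C": 4.0},
    "bob":   {"A": 4.0, "B": 2.0, "C": 3.0, "D": 5.0},
    "carol": {"A": 1.0, "B": 5.0, "D": 2.0},
}

def pearson(prefs, u, v):
    """Pearson correlation over the items both users have scored."""
    shared = [item for item in prefs[u] if item in prefs[v]]
    n = len(shared)
    if n == 0:
        return 0.0
    xs = [prefs[u][item] for item in shared]
    ys = [prefs[v][item] for item in shared]
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        return 0.0  # no variation, correlation is undefined
    return cov / (spread_x * spread_y)

def recommend(prefs, user, similarity=pearson):
    """Rank items the user has not scored by similarity-weighted average."""
    totals, sim_sums = {}, {}
    for other in prefs:
        if other == user:
            continue
        sim = similarity(prefs, user, other)
        if sim <= 0:  # assumption: skip dissimilar users
            continue
        for item, score in prefs[other].items():
            if item not in prefs[user]:
                totals[item] = totals.get(item, 0.0) + sim * score
                sim_sums[item] = sim_sums.get(item, 0.0) + sim
    ranked = [(totals[item] / sim_sums[item], item) for item in totals]
    return sorted(ranked, reverse=True)

print(recommend(ratings, "alice"))
```

Passing a transposed dictionary (item -> {user: score}) to the same `pearson` function would give item-to-item similarities, which is the starting point for the item-based variant.
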
- Discovering Groups (Clustering)
- Supervised learning (for contrast; clustering is unsupervised)
- uses example inputs and outputs
- e.g. neural networks, decision trees, support-vector machines, and Bayesian filtering
- Word Vectors of texts
- Hierarchical Clustering
- choose two nearest vectors to combine
- results in binary tree
- Can cluster articles or words
- transpose the matrix
- Dendrogram drawing
- K-Means clustering
- randomly place k centroids
- assign every item to the nearest centroid, move each centroid to the average location of the items assigned to it, and repeat until the assignments stop changing
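
A minimal sketch of the k-means loop described above, in plain Python. The function name `kmeans`, the squared-Euclidean distance, and the choice to leave an empty cluster's centroid where it is are assumptions made for illustration.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iterations=100, seed=0):
    rng = random.Random(seed)
    dims = len(points[0])
    # Randomly place k centroids inside the bounding box of the data.
    lows = [min(p[d] for p in points) for d in range(dims)]
    highs = [max(p[d] for p in points) for d in range(dims)]
    centroids = [[rng.uniform(lows[d], highs[d]) for d in range(dims)]
                 for _ in range(k)]
    assignment = None
    for _ in range(iterations):
        # Assign every point to its nearest centroid.
        new_assignment = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # nothing moved, so the clustering has converged
        assignment = new_assignment
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # an empty cluster keeps its old centroid
                centroids[c] = [sum(m[d] for m in members) / len(members)
                                for d in range(dims)]
    return assignment, centroids

points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centers = kmeans(points, k=2)
print(labels, centers)
```

Because the starting centroids are random, different seeds can give different clusterings, including one where a centroid ends up with no members.
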
- Searching and Ranking
- word index stored in relational database
- ranking
- content-based
- various metrics: word frequency, document location, word distance
- use inbound links
- simple count
- PageRank algorithm
- random walk
- sparse matrix multiplication iterations
- use link text
- learning from clicks
- click-tracking neural network (multilayer perceptron, i.e. MLP network)
- one hidden layer
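
The iterative PageRank computation mentioned above can be sketched as below. This is a simplified version under stated assumptions: the `links` graph is made up, every page is assumed to have at least one outbound link, and the damping factor of 0.85 is the commonly used value rather than something stated in these notes.

```python
def pagerank(links, damping=0.85, iterations=50):
    """links maps each page to the list of pages it links to.

    Assumes every page has at least one outbound link; pages without
    outbound links would leak rank in this simplified form.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    ranks = {page: 1.0 for page in pages}
    for _ in range(iterations):
        new_ranks = {}
        for page in pages:
            # Each linking page passes on its rank split across its links.
            inbound = sum(ranks[src] / len(targets)
                          for src, targets in links.items()
                          if page in targets)
            new_ranks[page] = (1 - damping) + damping * inbound
        ranks = new_ranks
    return ranks

# Hypothetical three-page web.
links = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
ranks = pagerank(links)
print(sorted(ranks, key=ranks.get, reverse=True))
```

Each pass is equivalent to one multiplication of the rank vector by the sparse link matrix, which is why the computation scales to large link graphs.
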
- Optimization
- stochastic optimization
- numerical solution
- cost function
- random searching
- hill climbing
- change one dimension of the solution vector at a time, moving to the neighboring solution with the lowest cost until no neighbor improves
- simulated annealing
- variable: temperature, starts very high and gradually gets lower
- a worse solution may still be accepted with probability exp(-(new cost - old cost) / temperature), so acceptance becomes rarer as the temperature drops
- genetic algorithms
- mutate, crossover, …
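
As an illustration of the temperature idea, here is a minimal simulated-annealing sketch over integer vectors. The function name `anneal`, the starting temperature, the cooling rate, and the toy cost function are all assumptions chosen for the example.

```python
import math
import random

def anneal(cost, domain, temp=10000.0, cooling=0.95, step=1, seed=0):
    """Minimize cost(vector); domain is a list of (low, high) integer bounds."""
    rng = random.Random(seed)
    current = [rng.randint(lo, hi) for lo, hi in domain]
    current_cost = cost(current)
    best, best_cost = list(current), current_cost
    while temp > 0.1:
        # Nudge one randomly chosen dimension, staying inside its bounds.
        i = rng.randrange(len(domain))
        lo, hi = domain[i]
        candidate = list(current)
        candidate[i] = max(lo, min(hi, candidate[i] + rng.choice((-step, step))))
        candidate_cost = cost(candidate)
        worse_by = candidate_cost - current_cost
        # Always accept improvements; accept a worse solution with a
        # probability that shrinks as the temperature falls.
        if worse_by < 0 or rng.random() < math.exp(-worse_by / temp):
            current, current_cost = candidate, candidate_cost
            if current_cost < best_cost:
                best, best_cost = list(current), current_cost
        temp *= cooling  # gradually cool down
    return best, best_cost

# Toy cost function: squared distance from the point (7, 7, 7).
solution, value = anneal(lambda v: sum((x - 7) ** 2 for x in v), [(0, 10)] * 3)
print(solution, value)
```

Setting the temperature to a very low value from the start makes this behave much like randomized hill climbing, since worse solutions are then almost never accepted.
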
- Document Filtering (to be expanded…)
- use words as features
- naive Bayesian classifier
- the Fisher method
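
A rough sketch of a word-feature naive Bayesian classifier follows. The class name `NaiveBayes`, the whitespace tokenizer, and the add-one smoothing are assumptions made for this example; the smoothing in particular is a common stand-in and may differ from the weighting the original material uses.

```python
import math
from collections import Counter, defaultdict

class NaiveBayes:
    def __init__(self):
        self.word_counts = defaultdict(Counter)  # category -> word -> doc count
        self.doc_counts = Counter()              # category -> number of docs

    def train(self, text, category):
        self.doc_counts[category] += 1
        for word in set(text.lower().split()):  # each word counted once per doc
            self.word_counts[category][word] += 1

    def log_score(self, text, category):
        """log P(category) + sum of log P(word | category)."""
        total_docs = sum(self.doc_counts.values())
        docs_in_cat = self.doc_counts[category]
        score = math.log(docs_in_cat / total_docs)
        for word in set(text.lower().split()):
            # Add-one smoothing keeps unseen words from zeroing the product.
            seen = self.word_counts[category][word]
            score += math.log((seen + 1) / (docs_in_cat + 2))
        return score

    def classify(self, text):
        return max(self.doc_counts, key=lambda c: self.log_score(text, c))

clf = NaiveBayes()
clf.train("buy cheap pills now", "spam")
clf.train("cheap pills online", "spam")
clf.train("meeting agenda for monday", "ham")
clf.train("lunch on monday", "ham")
print(clf.classify("cheap pills"))
print(clf.classify("monday meeting"))
```

Summing logarithms instead of multiplying probabilities avoids underflow on long documents and does not change which category scores highest.
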
- Modeling with Decision Trees
- Algorithm: CART (Classification and Regression Trees)
- choose the best split from all possible splits
- Gini impurity
- information entropy
- negative sum of p(x) * log2(p(x)) over all outcomes x
- recursively build the whole tree
- then can be used to classify new observations
- pruning the tree
- when it becomes overfitted
- checking pairs of nodes that have a common parent to see if merging them would increase the entropy by less than a specified threshold
- Dealing with
- missing data
- use both branches
- numerical outcomes
- use variance instead of entropy
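
The split-scoring measures named above can be written in a few lines. This is a sketch with hypothetical helper names (`gini_impurity`, `entropy`, `variance`, `information_gain`); entropy is computed as the negative sum of p(x) log2 p(x).

```python
import math
from collections import Counter

def gini_impurity(labels):
    """Chance that two random draws (with replacement) have different labels."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def entropy(labels):
    """Negative sum of p(x) * log2(p(x)) over the distinct labels."""
    n = len(labels)
    return -sum((count / n) * math.log2(count / n)
                for count in Counter(labels).values())

def variance(values):
    """Used in place of entropy when the outcome is numerical."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def information_gain(parent, left, right, impurity=entropy):
    """Impurity of the parent minus the size-weighted impurity of the split."""
    n = len(parent)
    weighted = (len(left) / n) * impurity(left) + (len(right) / n) * impurity(right)
    return impurity(parent) - weighted

labels = ["a", "a", "b", "b"]
print(gini_impurity(labels), entropy(labels))
print(information_gain(labels, ["a", "a"], ["b", "b"]))
```

A tree builder would evaluate `information_gain` for every candidate split and recurse on the two halves of the best one, stopping when no split yields a positive gain.
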
- Building Price Models
- k-nearest neighbors (kNN)
- weighted
- may need scaling or normalizing
- to estimate the probability density
- cross-validation
- divide data into training sets and test sets
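
A minimal sketch of weighted kNN for price estimation. The Gaussian weighting function, its `sigma` parameter, and the toy data are assumptions for illustration; any weight that decreases with distance would fit the same structure.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def gaussian_weight(dist, sigma=1.0):
    """Weight that falls off smoothly with distance and never goes negative."""
    return math.exp(-dist ** 2 / (2 * sigma ** 2))

def weighted_knn(data, query, k=3, weight=gaussian_weight):
    """data is a list of (feature_vector, price) pairs."""
    neighbors = sorted((euclidean(vec, query), price) for vec, price in data)[:k]
    total_weight = sum(weight(dist) for dist, _ in neighbors)
    if total_weight == 0:
        # All neighbors too far for the weights to register: plain average.
        return sum(price for _, price in neighbors) / len(neighbors)
    return sum(weight(dist) * price for dist, price in neighbors) / total_weight

# Hypothetical one-feature dataset: (feature,) -> price
data = [((1.0,), 10.0), ((2.0,), 20.0), ((3.0,), 30.0), ((10.0,), 100.0)]
print(weighted_knn(data, (2.0,), k=3))
```

If the features are on very different scales, they would need to be rescaled before calling `euclidean`, otherwise the largest-valued feature dominates the distance.
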
- Advanced Classification: Kernel Methods and SVMs
- basic linear classification
- use dot products to decide which class average a point is closer to
- kernel methods
- replace the dot product with a kernel function, which is equivalent to moving the points into a different space
- support-vector machines
- find the dividing line that is as far away as possible from the nearest points of each class
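
The kernel idea can be illustrated with a classifier that compares a point to each class average using only dot products, then swaps the dot product for a radial-basis kernel. This is a sketch, not an SVM: the function names, the `gamma` parameter, and the ring-shaped toy data are assumptions.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def rbf(a, b, gamma=1.0):
    """Radial-basis kernel: acts as a dot product in a transformed space."""
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))

def mean_kernel(xs, ys, kernel):
    """Average kernel value over all pairs, i.e. <mean(xs), mean(ys)>."""
    return sum(kernel(x, y) for x in xs for y in ys) / (len(xs) * len(ys))

def classify(point, class_a, class_b, kernel=dot):
    """Return 'a' if the point is closer to the average of class_a."""
    offset = (mean_kernel(class_b, class_b, kernel)
              - mean_kernel(class_a, class_a, kernel)) / 2
    score = (mean_kernel([point], class_a, kernel)
             - mean_kernel([point], class_b, kernel) + offset)
    return "a" if score > 0 else "b"

# Class a sits inside a ring formed by class b, so both averages are (0, 0)
# and no straight line separates the classes.
inner = [(0.0, 0.5), (0.5, 0.0), (0.0, -0.5), (-0.5, 0.0)]
outer = [(3.0, 0.0), (-3.0, 0.0), (0.0, 3.0), (0.0, -3.0)]
print(classify((0.0, 0.0), inner, outer, kernel=dot))  # linear version fails here
print(classify((0.0, 0.0), inner, outer, kernel=rbf))
print(classify((3.0, 0.2), inner, outer, kernel=rbf))
```

The classifier code is identical in both cases; only the function standing in for the dot product changes, which is the essence of the kernel trick.
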
- Finding Independent Features
- non-negative matrix factorization
- factor the article-word matrix into two matrices
- the features matrix: row for features, column for words
- the weight matrix: row for articles, column for features
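
One common way to compute such a factorization is with multiplicative update rules, sketched below in plain Python. The function name `factorize`, the random initialization, the iteration count, and the small `eps` guard against division by zero are assumptions for this example.

```python
import random

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def transpose(m):
    return [list(row) for row in zip(*m)]

def cost(a, b):
    """Sum of squared differences between two same-shaped matrices."""
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def factorize(v, k, iterations=200, seed=0, eps=1e-9):
    """Approximate v (articles x words) as weights (articles x k)
    times features (k x words), keeping every entry non-negative."""
    rng = random.Random(seed)
    rows, cols = len(v), len(v[0])
    w = [[rng.random() for _ in range(k)] for _ in range(rows)]
    h = [[rng.random() for _ in range(cols)] for _ in range(k)]
    for _ in range(iterations):
        # Update the features matrix: h *= (w^T v) / (w^T w h)
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(cols)]
             for i in range(k)]
        # Update the weights matrix: w *= (v h^T) / (w h h^T)
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(matmul(w, h), ht)
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
             for i in range(rows)]
    return w, h

# Hypothetical article-word counts: 3 articles, 3 words.
v = [[1, 2, 3], [2, 4, 6], [1, 0, 1]]
weights, features = factorize(v, k=2)
print(cost(v, matmul(weights, features)))
```

Because the updates only multiply by non-negative ratios, entries that start non-negative stay non-negative, which is what makes the resulting features interpretable as additive parts.
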
- Evolving Intelligence
- creating an algorithm that creates algorithms (genetic programming)
- mutation, crossover/breeding
- represent each algorithm as a tree so that it can be mutated and bred
- can be used to guess numerical functions or to build game AI
- Algorithm Summary
- Supervised Learning
- Bayesian Classifier
- Decision Tree Classifier
- Neural Networks
- Support-Vector Machines
- k-Nearest Neighbors
- Unsupervised Learning
- Clustering
- Multidimensional Scaling
- Non-Negative Matrix Factorization
- Optimization