Wikipedia Corpus

Here is a direct link to the English corpus, which contains all the articles from the 2014 data dump.

Top 65536 words with IDs and frequencies

The file contains comma-separated values in the form id,word,frequency. The frequency is not normalized: it is an integer count of the word's occurrences in the cleaned version of Wikipedia. Since there are only 65536 words, it is easy to load and normalize this information on the fly.
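
As a minimal sketch of the on-the-fly normalization described above, the following Python reads id,word,frequency rows and divides each count by the total. The function name load_word_frequencies and the three-row sample are illustrative assumptions, not part of the dataset; the sketch also assumes the file has no header row.

```python
import csv
import io


def load_word_frequencies(lines):
    """Parse id,word,frequency rows into {id: (word, relative_frequency)}.

    `lines` is any iterable of text lines, such as an open file.
    Assumes no header row (an assumption about the file, not a documented fact).
    """
    rows = [(int(r[0]), r[1], int(r[2])) for r in csv.reader(lines) if r]
    total = sum(count for _, _, count in rows)
    # Divide each raw count by the total so the values sum to 1.
    return {wid: (word, count / total) for wid, word, count in rows}


# Hypothetical sample standing in for the real word-list file.
sample = io.StringIO("0,the,6\n1,of,3\n2,and,1\n")
table = load_word_frequencies(sample)
```

For the real file, an open file handle can be passed in place of the in-memory sample.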

Highest frequency article for each word

This file contains one encoded article for each of the 65536 words. Each article is encoded as a list of comma-separated word IDs, which correspond to the word list linked above. Articles are sorted in descending order by word ID. Each article was chosen as the one that maximizes the frequency of the word it represents. Each line is written in the following form:

word-ID,1st-word-ID,2nd-word-ID, ... ,nth-word-ID\n
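
A line in this form could be decoded roughly as follows. This is a sketch under stated assumptions: the helper names parse_article_line and decode_article, the tiny id_to_word mapping, and the "<unk>" placeholder for IDs missing from the mapping are all hypothetical.

```python
def parse_article_line(line):
    """Split 'word-ID,1st-word-ID,...,nth-word-ID' into (word_id, article_ids)."""
    fields = [int(f) for f in line.strip().split(",") if f]
    # The first field is the word the article represents; the rest is the article.
    return fields[0], fields[1:]


def decode_article(article_ids, id_to_word):
    """Map word IDs back to words; unknown IDs become '<unk>' (an assumption)."""
    return " ".join(id_to_word.get(i, "<unk>") for i in article_ids)


# Hypothetical mapping; in practice it would be built from the word-list file.
id_to_word = {0: "the", 1: "of", 2: "and"}
word_id, article = parse_article_line("2,0,1,2,0\n")
text = decode_article(article, id_to_word)
```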


A Theory of How Columns in the Neocortex Enable Learning the Structure of the World

The neocortex is complex. Within its 2.5 mm thickness are dozens of cell types, numerous layers, and intricate connectivity patterns. The connections between cells suggest a columnar flow of information across layers as well as a laminar flow within some layers. Fortunately, this complex circuitry is remarkably preserved in all regions, suggesting that a canonical circuit consisting of columns and layers underlies everything the neocortex does. Understanding the function of the canonical circuit is a key goal of neuroscience.

Hierarchical Temporal Memory (HTM) Whitepaper

At the heart of Hierarchical Temporal Memory (HTM), our machine intelligence technology, are time-based learning algorithms that store and recall spatial and temporal patterns. This paper describes how the learning algorithms work and their biological mapping.
PDF Download

Semantic Folding White Paper PDF

The two faculties - making analogies and making predictions based on previous experiences - seem to be essential and could even be sufficient for the emergence of human-like intelligence.