======================
First Implementation
======================
For this first pass at data analysis, articles and exchange rate data will
be downloaded for three currency pairs: USD<->AUD, USD<->GBP, and GBP<->AUD.
Feature Extraction
==================
The HTML of the articles will first be preprocessed to exclude all items
not in a <p> tag, and then to remove all HTML markup.
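The preprocessing step above could look roughly like the following. This is
a minimal sketch using only the standard library ``html.parser`` module; the
names ``ParagraphTextExtractor`` and ``extract_paragraph_text`` are
hypothetical and are not part of the londonriots package.

```python
from html.parser import HTMLParser


class ParagraphTextExtractor(HTMLParser):
    """Keep only the text found inside <p> elements, dropping all markup."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._inside = False   # True while we are inside a <p> element
        self._chunks = []      # text pieces of the paragraph being read
        self.paragraphs = []   # finished paragraphs, markup removed

    def _flush(self):
        # Join the collected pieces and normalise runs of whitespace.
        text = " ".join("".join(self._chunks).split())
        if text:
            self.paragraphs.append(text)
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            # A new <p> implicitly closes any paragraph still open.
            self._flush()
            self._inside = True

    def handle_endtag(self, tag):
        if tag == "p":
            self._flush()
            self._inside = False

    def handle_data(self, data):
        if self._inside:
            self._chunks.append(data)


def extract_paragraph_text(html):
    """Return the plain text of every <p> element in ``html``."""
    parser = ParagraphTextExtractor()
    parser.feed(html)
    parser.close()
    parser._flush()  # pick up a final paragraph that was never closed
    return parser.paragraphs
```

Text inside inline tags such as ``<b>`` within a paragraph is kept, while
anything outside a ``<p>`` element (navigation, titles, scripts) is dropped.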
NLTK will then identify the named entities in the text, and the proper
nouns will be counted. Only proper nouns appearing three or more times will
be retained.
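The counting and thresholding can be sketched as below. The sketch assumes
the text has already been tagged into ``(word, tag)`` pairs with Penn
Treebank tags, which is the shape ``nltk.pos_tag`` returns; the function
name ``frequent_proper_nouns`` is hypothetical, and the tagging itself is
left out so that the example runs without NLTK installed.

```python
from collections import Counter

# Penn Treebank tags for singular and plural proper nouns.
PROPER_NOUN_TAGS = {"NNP", "NNPS"}


def frequent_proper_nouns(tagged_tokens, min_count=3):
    """Count proper nouns and keep those seen at least ``min_count`` times.

    ``tagged_tokens`` is an iterable of (word, tag) pairs.
    """
    counts = Counter(
        word for word, tag in tagged_tokens if tag in PROPER_NOUN_TAGS
    )
    return {word: n for word, n in counts.items() if n >= min_count}
```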
The resulting word lists will be represented as a sparse feature matrix in
which each unique word is assigned its own dimension. Each of the three
currency pairs is also assigned one dimension.
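One possible layout for such a matrix is sketched below. It is an
illustration under assumptions rather than the project's actual encoding:
the first three columns are reserved for the currency pairs, each new word
takes the next free column, and a row is stored as a ``{column: value}``
dictionary. The names ``build_index`` and ``to_sparse_row`` are
hypothetical.

```python
CURRENCY_PAIRS = ("USD<->AUD", "USD<->GBP", "GBP<->AUD")


def build_index(articles):
    """Assign a column to each currency pair and each unique word.

    ``articles`` is a list of (currency_pair, {word: count}) tuples.
    """
    index = {pair: column for column, pair in enumerate(CURRENCY_PAIRS)}
    for _, counts in articles:
        for word in sorted(counts):
            # Only words not seen before receive a new column.
            index.setdefault(word, len(index))
    return index


def to_sparse_row(pair, counts, index):
    """Encode one article as a sparse row: only non-zero columns are stored."""
    row = {index[pair]: 1}  # mark which currency pair the article belongs to
    for word, n in counts.items():
        row[index[word]] = n
    return row
```

Storing only the non-zero columns keeps each row small even when the
vocabulary across all articles grows large.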
An article is assumed to affect the average of the opening and closing
exchange rates of a currency pair on the trading day following the
publication of the article, and the fractional increase over the previous
day's opening and closing price is used as the value of the feature matrix
for the SVR.
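As a worked illustration of that calculation, the sketch below assumes the
baseline is the average of the previous day's opening and closing rates;
the text above does not spell this out, so treat it as one reasonable
reading. The function names are hypothetical.

```python
def midpoint(open_rate, close_rate):
    """Average of a day's opening and closing exchange rates."""
    return (open_rate + close_rate) / 2.0


def fractional_change(previous_day, following_day):
    """Fractional increase of the following day's average over the previous one.

    Each argument is an (open_rate, close_rate) tuple. Assumption: the
    baseline is the previous day's average of open and close.
    """
    baseline = midpoint(*previous_day)
    return (midpoint(*following_day) - baseline) / baseline
```

For example, a previous day of ``(1.0, 3.0)`` averages 2.0 and a following
day of ``(2.0, 3.0)`` averages 2.5, giving a fractional increase of 0.25.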
==============
Feed Handler
==============
This package handles newsfeeds for a currency pair. It will check the feed
for new articles, and when found, it will pull the new articles and hand
them off to the feature extractor.
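The "check for new articles" step can be sketched as follows. The sketch
assumes each feed entry is dict-like with an ``id`` or ``link`` key, as the
entries returned by the Universal Feed Parser are; fetching the feed and
calling the feature extractor are left out, and the name
``select_new_entries`` is hypothetical.

```python
def select_new_entries(entries, seen_ids):
    """Return the entries not seen before and record them in ``seen_ids``.

    ``entries`` is an iterable of dict-like feed entries; ``seen_ids`` is a
    set of identifiers that persists between polls of the feed.
    """
    fresh = []
    for entry in entries:
        # Fall back to the link when the feed supplies no explicit id.
        entry_id = entry.get("id") or entry.get("link")
        if entry_id is None or entry_id in seen_ids:
            continue
        seen_ids.add(entry_id)
        fresh.append(entry)
    return fresh
```

Calling the function again with the same entries returns an empty list, so
each article is handed off only once.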
Package Dependencies
====================

This requires the `Universal Feed Parser`_, the `feed cache package`_, and
the `requests package`_.

.. _Universal Feed Parser: http://www.feedparser.org/
.. _feed cache package: http://www.doughellmann.com/articles/pythonmagazine/features/feedcache/
.. _requests package: http://pypi.python.org/pypi/requests
=================
Getting Started
=================
Creating the virtualenv
-----------------------
First, download or install `virtualenv`_. This package can be used to
create isolated Python environments for working on a project without
affecting or being affected by an existing system Python. It still
requires that Python be installed, but after that it keeps mostly
to itself.

.. _virtualenv: http://pypi.python.org/pypi/virtualenv

Unpack the download. The top directory contains a script called
virtualenv.py. Run that script from the top of the londonriots checkout::

    python /path/to/virtualenv.py --no-site-packages dev.env

This will create a virtual environment in the directory "dev.env", and we
can use that for all of the development work, including installing
packages and running the components of londonriots itself.
Installing NLTK
---------------
NLTK (Natural Language Toolkit) will be used to extract useful words from
the feeds in order to help us find patterns. It can be installed with::

    ./dev.env/bin/pip install http://nltk.googlecode.com/files/nltk-2.0.1rc1.tar.gz

Installing and Setting Up Postgres
----------------------------------

The easiest way to install Postgres is to use Homebrew. You can find more
information about it here: http://mxcl.github.com/homebrew/

Once you have Homebrew installed, installing Postgres is as easy as::

    brew install postgres

After the installation is complete, run the following commands::

    initdb --username=postgres /usr/local/var/postgres
    mkdir -p ~/Library/LaunchAgents
    cp /usr/local/Cellar/postgresql/9.0.4/org.postgresql.postgres.plist ~/Library/LaunchAgents/
    launchctl load -w ~/Library/LaunchAgents/org.postgresql.postgres.plist
    createuser --createdb --encrypted --pwprompt --no-superuser --username=postgres --host=localhost devlondonriots
    createdb --host=localhost --username=devlondonriots devlondonriots

Create /etc/sysctl.conf if it doesn't already exist, and put the following
values in that file::

    kern.sysv.shmmax=64000000
    kern.sysv.shmmin=1
    kern.sysv.shmmni=256
    kern.sysv.shmseg=64
    kern.sysv.shmall=65536

Together, these steps create a database, a database user, and an automatic
launcher for Postgres, and set the shared memory settings for Postgres.
Development Installation
------------------------
Next, we'll install the londonriots package in "development" mode, which
downloads and installs all of the external packages, and then adds
londonriots itself to the virtualenv for testing::

    ./dev.env/bin/python setup.py develop

Running The Tests
-----------------

Run the test suite with the virtualenv's Python::

    ./dev.env/bin/python setup.py test