satra.github.com/atom.xml at master · satra/satra.github.com · GitHub

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">

  <title><![CDATA[satrajit.ghosh]]></title>
  <link href="http://satra.cogitatum.org/atom.xml" rel="self"/>
  <link href="http://satra.cogitatum.org/"/>
  <updated>2016-03-10T20:17:40-05:00</updated>
  <id>http://satra.cogitatum.org/</id>
  <author>
    <name><![CDATA[Satrajit Ghosh]]></name>

  </author>
  <generator uri="http://octopress.org/">Octopress</generator>


  <entry>
    <title type="html"><![CDATA[Knowledge through provenance]]></title>
    <link href="http://satra.cogitatum.org/blog/2013/09/27/knowledge-through-provenance/"/>
    <updated>2013-09-27T14:14:00-04:00</updated>
    <id>http://satra.cogitatum.org/blog/2013/09/27/knowledge-through-provenance</id>
    <content type="html"><![CDATA[<p>At the recent Neuroinformatics congress in Stockholm, I gave a presentation on
representing certain types of knowledge by capturing structured provenance info
in metadata stores. Video and slides are now available.</p>

<!-- more -->


<h2>Video</h2>

<div class="embed-video-container"><iframe src="http://www.youtube.com/embed/w1A8BvJuN3s "></iframe></div>


<h2>Slides</h2>

<script async class="speakerdeck-embed" data-id="8dfccc40f22301304d2956f02c133fac" src="http://satra.cogitatum.org//speakerdeck.com/assets/embed.js"></script>


]]></content>
  </entry>

  <entry>
    <title type="html"><![CDATA[Response to NIH NOT-HG-13-014]]></title>
    <link href="http://satra.cogitatum.org/blog/2013/09/27/response-to-nih-not-hg-13-014/"/>
    <updated>2013-09-27T14:03:00-04:00</updated>
    <id>http://satra.cogitatum.org/blog/2013/09/27/response-to-nih-not-hg-13-014</id>
    <content type="html"><![CDATA[<p>This is a response to the NIH request for information on data provenance and
data wrangling. <a href="http://grants.nih.gov/grants/guide/notice-files/NOT-HG-13-014.html">NOT-HG-13-014</a>.</p>

<!-- more -->


<p>Knowledge can be represented as a collection of claims, assertions and facts
linked through a graph of provenance. Typically the elements of the knowledge
graph are derived from experimental observations, analyses, simulations and
theoretical advances. However, in much of science, the information supporting
such assertions is unavailable or provided in a form that requires significant
human intervention. This reduces trust in these claims and greatly increases the
burden on researchers to uncover provenance and query across data sources.</p>

<p>The rapid proliferation of data and computation requires standards and platforms
that will prevent users from being trapped in data silos and from bearing the
costs of creating and maintaining software for multiple databases and multiple
data models. It is imperative that scientists embrace the increase in data to
promote greater reproducibility of results and more robust testing of methods.
Tools that require minimal resources on the part of the investigator and
simplify distribution of data or computational services are essential and
lacking.</p>

<p>A fundamental challenge in many biomedical fields is to integrate data across
species, spatial scales (nanometers to inches), temporal scales (microseconds
to years), instrumentation (e.g., electron microscopy, magnetic resonance
imaging) and disorders (e.g., autism, schizophrenia). However, datasets contain
ad hoc metadata and are processed with methods specific to the sub-domain,
further limiting integration. Of immediate concern is a lack of tools for
accessing and reusing data and computational processes and a lack of established
standards for datasharing.</p>

<p>This response focuses on two of the areas of the RFI: data provenance and
data wrangling.</p>

<h2>Data provenance</h2>

<p>Reproducibility is the cornerstone of scientific discovery and demonstrates
the validity of results [1, 2]. However, recent high-profile articles [3,
4] have emphasized the limited reproducibility in life sciences. The methods
section that most scientific publications rely on to describe data
acquisition and analysis details might provide a summary,
but cannot capture the details necessary to reproduce all the methods,
nor can they provide the reviewer with sufficient information for
verification. Therefore, most publications are reviewed on the basis of
perceived impact rather than veracity. Scientists have proposed supplementing
publications with data, code and virtual machines. However,
most journals do not have the capacity to distribute these without incurring
additional costs. Scientists often resort to compressed files on personal
websites that have no guarantee of persistence. Furthermore,
these files are typically not described in any machine-useful format,
thereby requiring human translation and without formal provenance
introduces potential sources of error.</p>

<p>Provenance needs to be captured in a standardized data model. A data model is
an abstract conceptual formulation of information that explicitly determines
the structure of data and allows software and people to communicate and
interpret data precisely (source: Wikipedia). Provenance is information
about entities, activities, and people involved in producing a piece of
data or a thing, which can be used to form assessments about its quality,
reliability or trustworthiness (Source: W3C Prov working group).</p>

<p>The provenance data-model [5, PROV-DM] forms the basis for the W3C provenance
effort, currently a W3C recommendation. PROV-DM is organized in six
components, respectively dealing with: (1) entities and activities and the
time at which they were created, used, or ended; (2) derivations of entities
from entities; (3) agents bearing responsibility for entities that were
generated and activities that happened; (4) a notion of bundle,
a mechanism to support provenance of provenance; (5) properties to link
entities that refer to the same thing; and (6) collections forming a
logical structure for its members. Furthermore, the PROV-DM concept of a
bundle captures the notion of a Research Object [6] (a shareable collection
that integrates data and processes) while still allowing for the ongoing
evolution of the individual constituent elements.</p>

<p>In the neuroimaging domain, we have adopted this model as part of our efforts
on establishing standards for data sharing [7]. The components include (1) a
structured terminology that provides semantic context to data,
(2) a formal data model for neuroimaging with robust tracking of data
provenance, and (3) a provenance library with neuroimaging extensions that
can be used for the extraction of provenance data by image analysts and
imaging software developers. Furthermore, such information needs to be
represented at every stage of a research process where data are being
acquired or transformed through analysis. An example of provenance flow in
brain imaging is shown in the following figure.</p>

<p><img src="http://satra.cogitatum.org/assets/EDC.png" width="700"></p>

<p>Figure 1: The life cycle of brain imaging [8].</p>

<p>In order for this to be useful, all captured information needs to be stored
and shared in a standard, queriable format. Applications can then repurpose
the necessary components to interact with users.</p>

<h2>Data wrangling</h2>

<p>Sharing data is not enough. For instance, in order to tackle the problems of
health care, data needs to be linked through rich, machine-useful metadata.
However, data sharing efforts in brain imaging or other branches of science
typically involve one of two options: 1) a zipped file on the Web with a page
describing what is in the file or 2) a database, often with a Web interface,
specialized for distributing such data. However, using these databases
still require humans to understand and interpret data structures and
datatypes. This necessitates <strong>significant effort</strong> on the part of an
application developer and also limits the ability to query across extensive
resources, effectively creating silos.</p>

<p>The technology behind solving these problems grows out of the Semantic Web
vision of machines publishing data on the Web for other machines to consume.
The W3C helped standardize this idea as the Resource Description Framework
[9, RDF] in the late 1990s and have been building on it since then,
with various components becoming standards, adopted at various scales and for
various applications around the world. In 2006, Berners-Lee introduced a
refinement called Linked Data [10], which has become particularly popular
for open government data and biomedical research.</p>

<p>A key goal of any data wrangling software solution would be to add targeted
elements to the Linked Data technology stack. For example,
adding &#8220;data registration&#8221; will allow systems to quickly find data made
available by other systems, without the limitations imposed by a centralized
search engine. It is a practical solution, based on RDF Linked Data,
to a set of long-standing problems in the distributed database field. In any
such solution, there are four critical areas that would need to be addressed:
vocabulary management, query, integration with exist databases and data
resources, and data access control.</p>

<p><strong>Vocabulary management:</strong> One of the essential elements of the Semantic Web
vision is that vocabularies can be shared as easily as Web pages. In
principle, the technical design supports this, since RDF vocabulary terms
(the identifiers for classes, properties, and individual entities – the
elements which provide structure to data) are Web addresses.    In an ideal
world, the Web content for each term would include formal documentation,
tutorials, user discussion, links to software, links to examples of use,
and even links to real data sources using that term. In practice, however,
the Web content for vocabularies often falls far short of this goal. When
people encounter a new RDF vocabulary term, they may have to resort to Web
searches and asking colleagues what they know about it. When people are
creating new vocabularies, while they often have the best intentions,
they rarely have the resources, at that moment, to create a well-designed and
helpful site, or the motivation or fortitude to maintain it. As a result,
there is a chilling effect, with people afraid to use even high-quality
vocabularies for fear they will fall into disrepair.</p>

<p><strong>Semantic Web Query:</strong> For most purposes, the value of data is realized when
it is queried and connected to other data. The Semantic Web uses a popular
query and update language called SPARQL [11]. SPARQL engines can integrate
disparate sources of data, perform complex pattern matches against
integrated data, and report tabular results or perform database
manipulations based on those results. One important feature of SPARQL is
Service Federation [12, 13]. Keywords in the language permit a SPARQL query
to invoke other queries, unifying multiple remote query services in one
query. This effectively transforms a &#8220;mash-up&#8221; app into a single,
declarative query. The SPARQL Protocol provides uniform access to all SPARQL
services. This permits systems to work with any remote SPARQL service
without requiring drivers for any particular implementation.</p>

<p><strong>Integrating Relational Databases (R2RML):</strong> The Semantic Web offers data
integration at the Web scale without requiring all data to be natively stored
in RDF. The R2RML [14] language defines a mapping of relational data into
RDF. Existing tools (D2RQ [15], Revelytix Spyder) translate queries over
this virtual RDF store into SQL queries which are then executed over the
native store. Another approach is to use the Direct Mapping of Relational
Data to RDF and use Semantic Web rules to transform the terms and structure
into a shared ontology. The vast majority of scientific data are stored in
relational databases, curated, validated and maintained by existing
toolchains, with no interoperability across them.</p>

<p><strong>Data Access Control:</strong> With biomedical data there is an explicit
requirement of access control, especially when the underlying data cannot be
deidentified. Deidentification requirements vary significantly across
institutions and countries and cannot always be used to protect data.
Accidental exposure of private data can have personal and group impact with
social and/or financial consequences. Thus it is important that a data
sharing architecture address access control. In the Semantic Web context,
this will require per-document, per-resource, and per-triple authentication
and authorization and will be guided by ongoing work on accountability in
decentralized systems [16, 17]. In many instances, the metadata could be
curated openly with only patient data served by databases or Web servers
requiring traditional OAuth or OpenID based access.</p>

<h2>Conclusion</h2>

<p>Capturing science requires that information pertaining to all aspects of a
research activity are captured and represented richly. However,
most scientific domains, including biomedical sciences,
only capture pieces of information that are deemed relevant. Current
technologies exist that deliver the components necessary to capture and
create this information-rich landscape in order to create platforms for
knowledge exploration. In particular, using a technology agnostic data
provenance model as the core representation and Semantic Web technologies
that leverage such a representation, can provide solutions to both data
provenance and data wrangling questions.</p>

<h2>References</h2>

<p>[1] John PA Ioannidis. “Why most published research findings are false.” In: PLoS medicine 2.8 (2005), e124.</p>

<p>[2] Ramal Moonesinghe, Muin J Khoury, and A Cecile JW Janssens. “Most published research findings are falsebut a little replication goes a long way.” In: PLoS medicine 4.2 (2007), e28.</p>

<p>[3] C Glenn Begley and Lee M Ellis. “Drug development: Raise standards for preclinical cancer research.” In: Nature 483.7391 (2012), pp. 531–533.</p>

<p>[4] Harold Pashler and EricJan Wagenmakers. “Editors Introduction to the Special Section on Replicability in Psychological Science: A Crisis of Confidence?” In: Perspectives on Psycho- logical Science 7.6 (2012), pp. 528–530. doi: 10.1177/1745691612465253. eprint: http: //pps.sagepub.com/content/7/6/528.full.pdf+html. url: http://pps.sagepub.com/ content/7/6/528.short.</p>

<p>[5] Y Gil, S Miles, K Belhajjame, H Deus, D Garijo, G Klyne, P Missier, S Soiland-Reyes, and S Zednik. “PROV model primer.” In: W3C Working Draft (2012). url: http://www.w3. org/TR/2012/WD-prov-primer-20121211/.</p>

<p>[6] Sean Bechhofer, Iain Buchan, David De Roure, Paolo Missier, John Ainsworth, Jiten Bhagat, Philip Couch, Don Cruickshank, Mark Delderfield, Ian Dunlop, Matthew Gamble, Danius Michaelides, Stuart Owen, David Newman, Shoaib Sufi, and Carole Goble. “Why linked data is not enough for scientists.” In: Future Generation Computer Systems 29.2 (2013). Special section: Recent advances in e-Science, pp. 599 –611. issn: 0167-739X. doi: 10.1016/j. future.2011.08.004. url: http://www.sciencedirect.com/science/article/pii/ S0167739X11001439.</p>

<p>[7] DB Keator, K. Helmer, J. Steffener, JA Turner, TGM Van Erp, S. Gadde, N. Ashish, GA Burns, BN Nichols, and SS Ghosh. “Towards structured sharing of raw and derived neu- roimaging data across existing resources.” In: arXiv preprint arXiv:1209.5922 (2012).</p>

<p>[8] Jean-Baptiste Poline, Janis L Breeze, Satrajit S Ghosh, Krzysztof Gorgolewski, Yaroslav O Halchenko, Michael Hanke, Karl G Helmer, Daniel S Marcus, Russell A Poldrack, Yannick Schwartz, John Ashburner, and David N Kennedy. “Data sharing in neuroimaging research.” In: Frontiers in Neuroinformatics 6.9 (2012). issn: 1662-5196. doi: 10.3389/fninf.2012. 00009. url: http://www.frontiersin.org/neuroinformatics/10.3389/fninf.2012. 00009/abstract.</p>

<p>[9] Ora Lassila and Ralph R Swick. “Resource description framework (RDF) model and syntax specification.” In: (1998).</p>

<p>[10] Tim Berners-Lee. “Design Issues: Linked Data.” In: (2006). url: http : / / www . w3 . org / DesignIssues/LinkedData.html.</p>

<p>[11] Eric PrudHommeaux and Andy Seaborne. “SPARQL query language for RDF.” In: W3C recommendation 15 (2008).</p>

<p>[12] Bastian Quilitz and Ulf Leser. “Querying distributed RDF data sources with SPARQL.” In: The Semantic Web: Research and Applications (2008), pp. 524–538.</p>

<p>[13] Carlos Buil-Aranda, Marcelo Arenas, and Oscar Corcho. “Semantics and optimization of the SPARQL 1.1 federation extension.” In: The Semanic Web: Research and Applications (2011), pp. 1–15.</p>

<p>[14] Souripriya Das, Seema Sundara, and Richard Cyganiak. “{R2RML: RDB to RDF Mapping Language}.” In: (2012).</p>

<p>[15] Christian Bizer and Andy Seaborne. “D2RQ-treating non-RDF databases as virtual RDF graphs.” In: Proceedings of the 3rd International Semantic Web Conference (ISWC2004). 2004, p. 26.</p>

<p>[16] Daniel J Weitzner, Harold Abelson, Tim Berners-Lee, Chris Hanson, James Hendler, Lalana Kagal, Deborah L McGuinness, Gerald Jay Sussman, and K Krasnow Waterman. “Transpar- ent accountable data mining: New strategies for privacy protection.” In: (2006).</p>

<p>[17] Oshani Wasana Seneviratne. “Augmenting the web with accountability.” In: Proceedings of the 21st international conference companion on World Wide Web. ACM. 2012, pp. 185–190.</p>
]]></content>
  </entry>

  <entry>
    <title type="html"><![CDATA[Provenance and metadata]]></title>
    <link href="http://satra.cogitatum.org/blog/2013/02/20/provenance-and-metadata/"/>
    <updated>2013-02-20T10:54:00-05:00</updated>
    <id>http://satra.cogitatum.org/blog/2013/02/20/provenance-and-metadata</id>
    <content type="html"><![CDATA[<p>The standards for datasharing taskforce of the INCF was tasked with establishing
data models and standards for sharing brain imaging and associated phenotypic,
genotypic and other metadata. This post is a summary of the problem space and
some of the efforts I have been associated with in addressing these issues.</p>

<!-- more -->


<h5>the goals</h5>

<p>I think the taskforce always had in mind two goals: data sharing and standards
for data sharing. These goals are linked, but the former very much requires the
latter. There is the further dichotomy in this that we need a way to address
existing data stored in a variety of databases and new data streaming in.</p>

<h5>the obstacles</h5>

<p>The biggest problem with data sharing is the fact that we do not really have any
common standards - perhaps NIFTI-1 and MINC-2 serve as standards for brain
imaging file formats, but in my view addresses somewhat of a narrow slice.
Throughout the brain imaging world, we use terms that we have come to recognize
as meaning something, but without a formal definition even when two people use
the same term we don&#8217;t know if they meant the &#8220;same&#8221; thing. For example, does
&#8220;gradient vector&#8221; mean the same thing for every person and program dealing with
diffusion weighted imaging data?</p>

<p>And this has turned out to be the biggest roadblock on our way to setting
standards. After all, with increasing computational approaches to data sharing
and exploration, machines more than humans need to be able to process the data
and humans to be able to see slices of data in intuitive ways. Across software
packages, imaging database providers, such common terminology is missing.</p>

<p>In addition to not having common terminology, we do not have a standard
mechanism for capturing provenance of data. After all, data on their own do not
implicitly tell us about sources of variance related to the processing
components. Efforts towards this have come from the <a href="http://provenance.loni.ucla.edu/">LONI group</a>, but are
very specific to workflow components of brain imaging and have not, till date,
reconciled with efforts to standardize provenance across the web such as the
<a href="http://www.w3.org/TR/prov-dm/">W3C provenance model</a>. Provenance needs to be captured at every stage
of data generation, not just for brain image processing software.</p>

<h5>the solutions</h5>

<p><em>the terms</em></p>

<p>Towards this we have started by defining and including terms into
<a href="http://neurolex.org/wiki/Main_Page">NeuroLex</a> and working with the <a href="http://www.neuinfo.org/">NIF</a> folks to make the platform
for term curation and inclusion better. Dicom terms are already
<a href="http://neurolex.org/wiki/Category:DICOM_term">there</a>, statistic terms are being developed and other terms will
be automatically included and curated once the input interface is refined.</p>

<p><em>provenance and the domain agnostic data model</em></p>

<p>We have also developed a technology agnostic data model (not an object model) on
top of the <a href="http://www.w3.org/TR/prov-dm/">W3C-PROV model</a> called <a href="http://nidm.nidash.org/">NIDM</a> to capture not just
entities and their attributes (as is currently done in most databases) but also
how they were derived (provenance). By representing this model in <a href="http://www.w3.org/RDF/">RDF</a> for
example, one can immediately leverage federated <a href="http://www.w3.org/TR/rdf-sparql-query/">SPARQL queries</a>. Domain
specificity is included via key-value pairs and through object models
(collections). The power of this approach is that by storing data and their
provenance richly one has access to this information and therefore one can
always transform or re-purpose the content (e.g., for more efficient searches on
specific data types).</p>

<p><em>the domain specific object models</em></p>

<p>We can turn some objects into this model fairly easily. For example,
<a href="http://freesurfer.net/">FreeSurfer</a> directory structures and statistics. Similar efforts are
underway to capture info from neuropsych assessments (COINS->neurolex mapping),
SPM.mat/FEAT.fsf,  workflow/process provenance (taking into account prior
efforts - such as LONI/Kepler/Galaxy), XNAT/csv->PROV conversion and testing.</p>

<p><em>the app framework</em></p>

<p>Unfortunately a lot of the above, while absolutely necessary for establishing
standards don&#8217;t convey to the average user a sense of progress in datasharing or
the standards. The app framework leverages the above data model to create apps
for data filtering/processing/visualization. Our initial apps will be built as
IPython notebooks as they demonstrate not just the virtues of this approach, but
are themselves executable documents that people can share and reuse.</p>

<h5>the summary</h5>

<p>We have finally gotten to the stage in this neuroinformatics world where a
number of the software we use are fairly stable and the data and object models
are getting clearer in our heads.</p>

<p>What we are hoping is that a common vocabulary + an extensible format agnostic
data representation allows us to communicate without knowing whether you do rdf,
python, java, xml, javascript or other technologies. What we do need from the
community is to pitch in towards the curation, app building and testing efforts.</p>

<p>And we still have an important job of communicating it better to others.</p>
]]></content>
  </entry>

  <entry>
    <title type="html"><![CDATA[Help develop a winning strategy]]></title>
    <link href="http://satra.cogitatum.org/blog/2013/01/22/collaborativeproblemsolving/"/>
    <updated>2013-01-22T14:16:00-05:00</updated>
    <id>http://satra.cogitatum.org/blog/2013/01/22/collaborativeproblemsolving</id>
    <content type="html"><![CDATA[<p>A friend, Pablo Suarez, is running a competition to determine the best strategy
for winning in game called <a href="http://www.climatecentre.org/site/paying-for-predictions">&#8220;Pay for predictions&#8221;</a>. Our goal is to openly
and collaboratively help him, by providing some winning strategies for the game.
So spread the word!</p>

<!-- more -->


<p>The game intends to capture real life situations faced by the Red Cross Red
Crescent when dealing with natural disasters (e.g., flooding, earthquakes,
etc.,.). In keeping with the theme of collaborative problem solving as discussed
in the book <a href="http://www.amazon.com/Reinventing-Discovery-The-Networked-Science/dp/0691148902/ref=tmm_hrd_title_0">&#8220;Reinventing discovery&#8221; by Michael Nielsen</a> and the approach
taken to use <a href="http://www.gamesforchange.org/">games for social change</a>, the idea behind this shout-out is
to contribute solutions to the challenge.</p>

<p>Towards this, I have setup a <a href="https://github.com/satra/payforpredictions">Github repo</a> where everybody is welcome to
contribute. Do read the <a href="http://www.climatecentre.org/site/paying-for-predictions">guidelines</a> of the competition. It is possible
that there is a very simple solution, or one where a simulation can reveal the
optimal solution or many or none. The goal here is to demonstrate, much like the
<a href="http://polymathprojects.org/">&#8220;polymath project&#8221;</a>, that we can solve something quicker and possibly
better collaboratively. We understand that everybody may not have the time or
interest, but if you know someone who does, please let that person know.</p>

<p>Further, in keeping with the opensource spirit of scientific computation, I have
setup an <a href="http://nbviewer.ipython.org/">IPython notebook</a> to simulate the game. Based on the wonderful
work of the IPython team and contributors, this notebook can also deliver a
slideshow of the evolving solution. If you like Python, please contribute to the
notebook.</p>

<p><a href="http://nbviewer.ipython.org/urls/raw.github.com/satra/payforpredictions/master/payforpredictions.ipynb">Notebook</a></p>

<p><a href="http://slideviewer.herokuapp.com/url/raw.github.com/satra/payforpredictions/master/payforpredictions.ipynb#/">Slideshow</a></p>

<p>If you prefer something else, please add your code/document to your fork and
send a pull-request. Let us use the Github <a href="https://help.github.com/articles/using-pull-requests">pull-request system</a>,
<a href="https://github.com/satra/payforpredictions/issues/">issue tracker</a> and <a href="https://github.com/satra/payforpredictions/wiki/">wiki</a> for discussions.</p>

<p>So come one and come all and help develop a winning strategy. Go
<a href="https://github.com/satra/payforpredictions">fork and contribute</a>!</p>
]]></content>
  </entry>

</feed>