bme-bigdata

Biomedical engineering - Big Data analytics platforms

Setting up a data science environment

Anaconda: Conda cheatsheet

Git: Git cheatsheet, Git tutorial

Jupyter Notebook: Tips & Tricks

Using Markdown: 3 min tutorial

Data manipulation: NumPy Quick tutorial

LAB: Simple data exploration and making notes

Infrastructure configuration

CPU, GPU, multi-node clusters

AWS, Google Cloud

Chef, Puppet

Docker, Vagrant

Parallel and distributed computing

Python, Dask

LAB: Design and execute algorithms on a cluster
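
As a warm-up for this lab, the split / map / combine pattern behind most cluster algorithms can be sketched with the Python standard library alone. This is a minimal illustration, not the lab solution: the function names (`chunked`, `partial_sum_of_squares`, `parallel_sum_of_squares`) and the chunk size are hypothetical. A thread pool is used here only so the example runs anywhere; on a real cluster the same structure would be handed to a process pool or to Dask.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(values, size):
    """Split a list into consecutive chunks of at most `size` items."""
    return [values[i:i + size] for i in range(0, len(values), size)]


def partial_sum_of_squares(chunk):
    """Work done independently on one chunk (the 'map' step)."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(values, chunk_size=1000, workers=4):
    """Map over chunks concurrently, then combine the partial results."""
    chunks = chunked(values, chunk_size)
    # Assumption: threads stand in for cluster workers in this sketch.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(partial_sum_of_squares, chunks))
    return sum(partials)  # the 'combine' (reduce) step


data = list(range(10_000))
total = parallel_sum_of_squares(data)
```

The point of the design is that each chunk is processed without knowing about the others, so the per-chunk function can be moved to another core or another machine without changing the combine step.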

Big Data platforms

Hadoop ecosystem: HDFS, MapReduce, Impala, HBase

Hadoop ecosystem: Pig, Hive, Sqoop, Flume

Hadoop ecosystem: Hue, Mahout

Apache Spark, Apache Storm

Cloudera, Databricks

Databases

HDFS, HBase

Neo4j, FlockDB

Cassandra

Redis, Riak KV, Riak TS

Machine Learning with Large Datasets

scikit-learn

TensorFlow

Spark MLlib

LAB: ML with Large Datasets

Big Data project

LAB: Real-time data processing

Stack: Redis, Apache Storm, Flask, D3.js, TensorFlow