Upcoming Meetups

Deep dive Avro and Parquet – Read Avro/Write Parquet using Kafka and Spark

Tuesday, Apr 5, 2016, 7:00 PM

NJ Big Data and Hadoop Meetup
3525 Quakerbridge Rd #1400 IBIS Office Plaza, Suite 1400 Hamilton Township, NJ

49 Hadoop Attending

Agenda
1) Avro and Parquet – When and why to use which format?
2) Data modeling – Avro and Parquet schema
3) Workshop
– Read Avro input from Kafka
– Transform data in Spark
– Write data frame to Parquet
– Read back from Parquet

Speakers
Timothy Spann – Sr. Solutions Architect, airisDATA
Srinivas Daruna – Data Engineer, airisDATA

Check out this Meetup →

 

Scala + Spark SQL Workshop

Thursday, Mar 10, 2016, 7:00 PM

NJ Big Data and Hadoop Meetup
3525 Quakerbridge Rd #1400 IBIS Office Plaza, Suite 1400 Hamilton Township, NJ

42 Hadoop Attending

Agenda
1) Scala and Spark
– Why the functional paradigm?
– Functional programming fundamentals
– A programming feature and hands-on (e.g. functions, collections, pattern matching, implicits – speaker choice)
– Tie it back to Spark
2) Spark SQL
– Data frames and data sets
– Logical and physical plan
– Hands-on workshop

Speakers
Rajiv Singla – Data Engineer, airisDATA
Kristin…

Check out this Meetup →

NJ Data Science – Apache Spark

Princeton, NJ
360 Data Scientists

Large Scale Data Analysis to improve Business Profitability
• Framework – Apache Spark, Hadoop
• Machine learning – SparkML, H2O, R
• Graph Processing – GraphX, Titan, Neo4j…

Check out this Meetup Group →

 

Workshop – How to Build Recommendation Engine using Spark 1.6 and HDP

Thursday, Mar 17, 2016, 7:00 PM

Princeton University – Lewis Library Rm 122
Washington Road and Ivy Lane, Princeton, NJ 08544 Princeton, NJ

53 Data Scientists Attending

Agenda
a) Hands-on – Build a data analytics application using Spark, Hortonworks, and Zeppelin. The session explains RDD concepts, DataFrames, and sqlContext, uses Spark SQL to work with DataFrames, and explores the graphical abilities of Zeppelin.
b) Follow along – Build a Recommendation Engine – This will show how to build a predictive analytics (MLlib) …

Check out this Meetup →


NJ Data Science Meetup – How to Build Data Analytics Applications with Spark and Hortonworks

Workshop – How to build data Analytics app & Reco Engine using Spark + Horton

Thursday, Mar 17, 2016, 7:00 PM

Princeton University – Lewis Library Rm 122
Washington Road and Ivy Lane, Princeton, NJ 08544 Princeton, NJ

16 Data Scientists Attending

Agenda
a) Hands-on – Build a data analytics application using Spark, Hortonworks, and Zeppelin. The session explains RDD concepts, DataFrames, and sqlContext, uses Spark SQL to work with DataFrames, and explores the graphical abilities of Zeppelin.
b) Follow along – Build a Recommendation Engine – This will show how to build a whole web app with predictive…

Check out this Meetup →

Scala Days 2016 NYC

The Scala Days 2016 schedule has been announced!

Highlights:

Beyond Shuffling: Scaling Apache Spark
by Holden Karau @holdenkarau

Scala: The Unpredicted Lingua Franca for Data Science
by Andy Petrella @noootsab and Dean Wampler @deanwampler

Build a Recommender System in Apache Spark and Integrate It Using Akka
by Willem Meints @willem_meints

Implementing Microservices with Scala and Akka
by Vaughn Vernon @VaughnVernon

Microservices based off Akka cluster at iHeartRadio
by Kailuo Wang @kailuowang

Building a High-Performance Database with Scala, Akka & Spark
by Evan Chan @evanfchan

Large scale graph analysis using Scala and Akka
by Ben Fonarov @chuwiey

Distributed Real-Time Stream Processing: Why and How
by Petr Zapletal @petr_zapletal

Deep Learning and NLP with Spark
by Andy Petrella @noootsab

 

Fans of Scala, Spark, Big Data, machine learning, real-time computing, stream processing, functional programming, and reactive programming all have great talks to choose from. The lineup has tons of great speakers, including the developer of Spark Notebook, top people in Scala, and a good representation of leading industry users.


Spark Summit East 2016 Schedule Released

Head on over to Spark Summit East 2016. The schedule of talks is now up.

The highlights include a talk on Spark 2.0, giving us a peek into the future.

The developer tracks look pretty amazing to me:

There are also great enterprise track talks:

 

Scala Days New York 2016

Scala Days

May 9th-13th, 2016 New York

Scala Days will be in NYC on May 9th through May 11th, 2016. It will be conveniently located in Midtown at the AMA Executive Conference Center. If you are a Scala developer, a Spark developer, a Java developer looking to learn, or a Big Data enthusiast, this will be the place to be in May. More than a few presentations from the last few Scala Days have entered my rotation of must-reads.

I will be providing in-depth coverage of the event. This, along with Spark Summit, will be on my must-do list for 2016.

The two days immediately following the event will offer some awesome training opportunities for Scala, Spark, the SMACK stack, microservices, reactive programming, and Akka. The price goes up on January 20th and then again on March 16th, so put in those requests ASAP.

The program will be posted in February, but it’s guaranteed to be interesting, with topics on Scala, Akka, and Spark pretty much assured. Typesafe is involved, so you know there will be good quality content.

Check out some videos from the 2015 Scala Days in San Francisco.

Bold Radius had an awesome talk on Akka at the 2015 event. Be on hand so you can ask the experts and speakers questions and get more in-depth knowledge on these advanced topics.

Looking at previous agendas, like the one from Amsterdam, will give you a good preview. Topics like “Scala – The Real Spark of Data Science” give you an idea that this will be a very worthwhile conference.

I hope to see you there. I’ll post more details when they come in.

Check with your local meetup for a discount code. If you are in New Jersey, see me at the NJ Data Science / Spark Meetup and I’ll get you a code.

If you are curious about Scala, check out Typesafe Activator and SBT to get up and running quickly with the full development toolkit.

 

 

 

Free Hadoop, Spark, Big Data Training

Free Hadoop Training List

http://bigdatauniversity.com/courses/introduction-to-solr/

Spark Fundamentals I

 

Spark Fundamentals II

http://bigdatauniversity.com/courses/text-analytics-essentials/

Hadoop Fundamentals I

Big Data Fundamentals

Hadoop Developer Day Event

Introduction to Pig

 

http://www.coreservlets.com/hadoop-tutorial/

Accessing Hadoop Data Using Hive

 

Using HBase for Real-time Access to your Big Data – Version 2

 

Introduction to Scala

https://www.udemy.com/hadoopstarterkit/?dtcode=VAN2bNB4geMV

https://www.udemy.com/big-data-basics-hadoop-mapreduce-hive-pig-spark/?dtcode=6oLh2qk4geNz

https://www.udemy.com/data-analytics-using-hadoop-in-2-hours/?dtcode=Wa8s2vV4geNW

https://www.udemy.com/data-analytics-using-hadoop-in-2-hours/learn/

https://www.udemy.com/hadoopstarterkit/learn

https://developer.yahoo.com/hadoop/

http://adooputorialraining180.teachable.com/courses/hadoop-free-course-training-tutorial

http://www.ibm.com/developerworks/data/library/techarticle/dm-1209hadoopbigdata/

https://www.mapr.com/services/mapr-academy/big-data-hadoop-online-training

https://www.mapr.com/services/mapr-academy/hadoop-essentials

https://www.mapr.com/services/mapr-academy/Developing-Hadoop-Applications

https://www.mapr.com/training/hadoop-demand-training/dev-325

https://www.mapr.com/services/mapr-academy/developing-hbase-applications-basics-training-course-on-demand

https://www.mapr.com/services/mapr-academy/developing-hbase-applications-advanced-training-course-on-demand

https://www.mapr.com/services/mapr-academy/apache-spark-essentials

https://www.mapr.com/services/mapr-academy/build-monitor-apache-spark-applications

https://www.mapr.com/services/mapr-academy/apache-hive-essentials-training-course-on-demand

https://www.mapr.com/services/mapr-academy/apache-pig-essentials-training-course-on-demand

https://www.mapr.com/services/mapr-academy/apache-drill-training-course-on-demand

https://www.mapr.com/services/mapr-academy/apache-drill-architecture-training-course-on-demand

MapReduce and YARN

https://www.udemy.com/big-data-basics-hadoop-mapreduce-hive-pig-spark/

https://www.edx.org/course/scalable-machine-learning-uc-berkeleyx-cs190-1x

https://www.edx.org/course/introduction-big-data-apache-spark-uc-berkeleyx-cs100-1x

Pivotal HDB

http://academy.pivotal.io/course/141303

http://www.cloudera.com/content/www/en-us/resources/training/cloudera-essentials-for-apache-hadoop-the-motivation-for-hadoop.html

 

Tools

Cloudera VM Download

http://www.cloudera.com/content/www/en-us/downloads/quickstart_vms/5-5.html

Cask VM Download

http://www.cloudera.com/content/www/en-us/downloads/cdap.html