Top 12 Tutorials and Workshops for Hadoop and related big data technologies

 

 

 

Using Spring-XD to Load Files into GemFire XD

Spring XD Scripts

My General Setup Script (I save it in setup.xd and load it via script --file setup.xd)

hadoop config fs --namenode hdfs://pivhdsne:8020
admin config server http://localhost:9393
hadoop fs ls /
stream list

The Script for Loading a File into GemFireXD via Spring-XD

stream create --name fileload --definition "file --dir=/tmp/xd/input/load --outputType=text/plain |  jdbc --tableName=APP.filetest --columns=id,name" --deploy

 

Spring XD Configuration for GemFire XD

Copy the GemFire XD JDBC driver into the Spring XD lib directory (you might need tools.jar as well)

cp /usr/lib/gphd/Pivotal_GemFireXD_10/lib/gemfirexd-client.jar /opt/pivotal/spring-xd/xd/lib/

Modify the JDBC sink’s properties to point to your GemFire XD instance. If you are using the Pivotal HD VM and installed Spring XD with yum (sudo yum update spring-xd), this is the location:

/opt/pivotal/spring-xd/xd/config/modules/sink/jdbc/jdbc.properties
url = jdbc:gemfirexd://localhost:1527
username = gfxd
password = gfxd
driverClassName = com.pivotal.gemfirexd.jdbc.ClientDriver

For the peer client driver you need more files from the GemFire XD lib directory (the .so binaries); linking them rather than copying is probably a good idea.
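Before restarting Spring XD, it can help to confirm that the edited jdbc.properties parses and contains the four keys shown above. The following is only a minimal sketch under stated assumptions: the class `JdbcSinkConfigCheck` and its methods are hypothetical helpers, not part of Spring XD or GemFire XD, and `connect` assumes the GemFire XD client jar is on the classpath and a server is listening at the configured URL.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.Properties;

// Hypothetical helper (not part of Spring XD): validates a
// jdbc.properties-style text and can open a test connection with it.
final class JdbcSinkConfigCheck {

    // The four keys used in the jdbc.properties shown above.
    static final String[] REQUIRED_KEYS = {"url", "username", "password", "driverClassName"};

    // Parses the properties text and fails fast if a required key is missing.
    static Properties load(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        for (String key : REQUIRED_KEYS) {
            if (props.getProperty(key) == null) {
                throw new IllegalStateException("Missing property: " + key);
            }
        }
        return props;
    }

    // Opens a connection using the parsed values.
    // Assumption: the driver jar is on the classpath and the server is running.
    static Connection connect(Properties props) throws SQLException, ClassNotFoundException {
        Class.forName(props.getProperty("driverClassName"));
        return DriverManager.getConnection(
                props.getProperty("url"),
                props.getProperty("username"),
                props.getProperty("password"));
    }

    public static void main(String[] args) {
        String sample = String.join("\n",
                "url = jdbc:gemfirexd://localhost:1527",
                "username = gfxd",
                "password = gfxd",
                "driverClassName = com.pivotal.gemfirexd.jdbc.ClientDriver");
        Properties props = load(sample);
        System.out.println("Parsed URL: " + props.getProperty("url"));
        // connect(props) is not called here because it needs a running GemFire XD server.
    }
}
```

In practice you would read the text from the real file instead of the inline sample and call `connect` once the server is up.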

 

GemFire XD Setup

gfxd
connect client 'localhost:1527';

create table filetest (id int, name varchar(100)) REPLICATE PERSISTENT;
select id, kind, netservers from sys.members; 
select * from filetest;

Spring XD Commands

stream list

show your jobs

 

Reference:

 

Quick Look: Spring XD

 

Spring XD: a really quick way to batch process data from an easy-to-run shell. It’s very easy to set up a single-node version of XD and use it on Windows, Mac or Linux. Looks like a great tool.

 

 

 

Quick Tip: Spring REST Utility for Current HTTP Request

Utility Method for Spring REST

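 import javax.servlet.http.HttpServletRequest;
 import org.springframework.util.Assert;
 import org.springframework.web.context.request.RequestAttributes;
 import org.springframework.web.context.request.RequestContextHolder;
 import org.springframework.web.context.request.ServletRequestAttributes;

 // Returns the HttpServletRequest that Spring has bound to the current thread.
 public static HttpServletRequest getCurrentRequest() {
     RequestAttributes requestAttributes = RequestContextHolder.getRequestAttributes();
     Assert.state(requestAttributes != null, "Could not find current request via RequestContextHolder");
     Assert.isInstanceOf(ServletRequestAttributes.class, requestAttributes);
     HttpServletRequest servletRequest = ((ServletRequestAttributes) requestAttributes).getRequest();
     Assert.state(servletRequest != null, "Could not find current HttpServletRequest");
     return servletRequest;
 }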

Sometimes it’s easier to grab the underlying servlet request directly to read headers or other request values.

 final String userIpAddress = getCurrentRequest().getRemoteAddr();
 final String userAgent = getCurrentRequest().getHeader("user-agent");

This is used in a simple REST service that handles HTTP POST requests, running on the awesome Cloud Foundry:

(Source)
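One caveat with the `getRemoteAddr()` call shown above: when requests pass through a router or load balancer, as they typically do on a hosted platform, that call may return the proxy’s address rather than the user’s. Proxies commonly record the original client in the `X-Forwarded-For` header. The following is a minimal sketch under that assumption; `ClientIpResolver` and `resolve` are hypothetical names, not Spring APIs, and the header should only be trusted when it is set by a proxy you control.

```java
// Hypothetical helper, not a Spring API: picks the client address from an
// X-Forwarded-For header value, falling back to the socket address.
final class ClientIpResolver {

    // forwardedFor: raw X-Forwarded-For header value (may be null).
    // remoteAddr:   value of HttpServletRequest.getRemoteAddr().
    static String resolve(String forwardedFor, String remoteAddr) {
        if (forwardedFor != null) {
            // The header is a comma-separated list; the first entry is the
            // address reported for the original client.
            int comma = forwardedFor.indexOf(',');
            String first = (comma >= 0 ? forwardedFor.substring(0, comma) : forwardedFor).trim();
            if (!first.isEmpty()) {
                return first;
            }
        }
        return remoteAddr;
    }

    public static void main(String[] args) {
        // Example addresses are illustrative only.
        System.out.println(resolve("203.0.113.7, 10.0.0.1", "10.0.0.1"));
        System.out.println(resolve(null, "192.0.2.15"));
    }
}
```

With the utility method above, a call could look like `ClientIpResolver.resolve(getCurrentRequest().getHeader("X-Forwarded-For"), getCurrentRequest().getRemoteAddr())`.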

Tool for Creating Your Test JSON.

Spring Boot Documentation

 

Quick Links: Modern Tools for Modern Big Data Applications

Modern Front-End Tools (Front-End Build Wars)

A ton of great tools for running all kinds of build tasks and building modern front-ends. These are to the front end what Maven and Gradle are to the back end.
Build Wars: Gulp vs Grunt
Spring-cleaning Unused CSS With Grunt, Gulp, Broccoli or Brunch
Yeoman – Modern workflows for modern webapps
Environment-specific Builds With Grunt, Gulp or Broccoli
Gulp, Grunt, Whatever – Pony Foo
Brunch | ultra-fast HTML5 build tool

Modern Web Apps (Java 8 + AngularJS + Spring 4 + Gradle)

trackr: An AngularJS app with a Java 8 backend – Part I | techdev Solutions
trackr: An AngularJS app with a Java 8 backend – Part II | techdev Solutions
Pagination with Spring Data and HATEOAS in an AngularJS app | Patrick Grimard’s Java Blog

Modern Rapid Development and Production Deployment Tools (Spring Boot, Spring 4)

joshlong/bookmarks
joshlong/boot-examples
joshlong/boot-it-up
RestController in Spring 4.0
Spring 4 Tutorials
Spring Transaction Management
Spring Framework 4 on Java 8 // Speaker Deck
Pivotal presentations channel

Big Data (Spring with Hadoop)

SpringOne2GX 2013 Replay: Real Time Analytics with Spring
SpringOne2GX 2013 Replay: Hadoop – Just the Basics for Big Data Rookies
dzone.com – Seven Databases in Seven Weeks: Hbase, Day 2
Apache Solr real-time live index updates at scale with Apache Hadoop | Java Code Geeks

Modern Cloud Development and Hosting (Cloud Foundry)

Run Your Java Code on Cloud Foundry – Andy Piper (Pivotal)
clds.sdsc.edu/sites/clds.sdsc.edu/files/Pivotal_DS_supercomputing2013.pdf
Run the site on Cloud Foundry · spring-io/sagan Wiki

Pivotal Resources

A real-time architecture using Hadoop and Storm @ JAX London
Spring Boot with Groovy and Friends
HyperLogLog – Wikipedia, the free encyclopedia
The lost outpost | a weblog by Andy Piper about technology, photography, and life
Installing a Hadoop Cluster with three Commands | codecentric Blog
Using Ambari Blueprints to automatically provision and install the Lambda Architecture | codecentric Blog
3 lessons in database design from the team behind Twitter’s Manhattan — Tech News and Analysis
Bloom filter – Wikipedia, the free encyclopedia
Hadoop Tips: Bloom Filters in HBase and Chrome
Enterprise-ready production-ready Java batch applications powered by Spring Boot | codecentric Blog
MapReduce Introduction – Tutorial
Scaling SQL with Redis – David Cramer’s Blog
HBase – Apache HBase™ Home
Work with Hadoop and NoSQL Databases with Toad for Cloud – ReadWrite
Google Research Publication: MapReduce
Overview | Postgres-XL
headissue/cache2k-benchmark
In 45 Min, Set Up Hadoop (Pivotal HD) on a Multi-VM Cluster & Run Test Data | Pivotal P.O.V.
BOSH Components | Cloud Foundry Docs
Pivotal Open Source Hub (San Francisco, CA) – Meetup
WebScaleSQL | “We’re Gonna Need A Bigger Database”
Eight Terminal Utilities Every OS X Command Line User Should Know · mitchchn.me
Report: NoSQL Databases – Providing Extreme Scale and Flexibility — Gigaom Research
HDFS Architecture
PostgreSQL: PostgreSQL 9.4 Beta 1 Released
Using Redis at Pinterest for Billions of Relationships | Pivotal P.O.V.
Replication, Clustering, and Connection Pooling – PostgreSQL wiki
Hadoop Content on InfoQ
dzone.com – JSON on Hadoop Example for Extending HAWQ Data Formats Using Pivotal eXtension Framework (PXF)
When should I use Greenplum Database versus HAWQ? | PivotalGuru
dzone.com – In 45 Min, Set Up Hadoop (Pivotal HD) on a Multi-VM Cluster & Run Test Data
PostgreSQL: Documentation: 9.3: High Availability, Load Balancing, and Replication
PostgreSQL – Wikipedia, the free encyclopedia
The Hadoop Ecosystem Table
Pivotal Hadoop Distribution and HAWQ Realtime Query Engine | Architects Zone
momjian.us/main/writings/pgsql/replication.pdf
dzone.com – EMC World 2014: Pivotal and Isilon Take Hadoop Prime Time in the Enterprise
dzone.com – Pivotal ship Hadoop distro complete with ‘world’s fastest’ SQL query engine
dzone.com – Pivotal Hadoop Distribution and HAWQ Realtime Query Engine
HAWQ | Pivotal P.O.V.
Cloud Foundry Eclipse Plugin | Pivotal Docs
dzone.com – Transform Your Skills: Simple Steps to Set Up SQL on Hadoop
dzone.com – What Makes Hadoop So Important & How To Gain Business Value From It
Search Results For: hadoop pivotal
Cloud Foundry Environment Variables | Cloud Foundry Docs
PCC Installation Checklist | Pivotal Docs
Pivotal Web Services Documentation | Pivotal CF Docs
Hadoop Tutorials from Hortonworks
Tips for Java Developers | Cloud Foundry Docs
Spring for Apache Hadoop
Spring Framework Reference Documentation
16. Web MVC framework
cloudfoundry/java-test-applications
Getting Started with Pivotal Web Services | Pivotal CF Docs
6 Easy Steps: Deploy Pivotal’s Hadoop on Docker | Pivotal P.O.V.
cloudfoundry/java-buildpack-dependency-builder
Pivotal-Field-Engineering/checkgreenplum
Pivotal-Field-Engineering/groovy-sqlfire
OpenTSDB – A Distributed, Scalable Monitoring System
Pivotal-Field-Engineering/pxf-field
Pivotal-Field-Engineering/tcollector
Pivotal-Field-Engineering/pmr-common
Pivotal-Field-Engineering/cf-scale-boot
Pivotal-Field-Engineering/gemfire-bosh-release
Big Data Benchmark
vFabric/vFabricReferenceArchitecture
vFabric
Pivotal-Field-Engineering/benchmark
Pivotal-Field-Engineering/retail-demo-xd
gopivotal/spring-travel-app
gopivotal/SpringTraderHD
gopivotal/pivotal-samples
Pivotal Web Services

Quick Links: Hibernate / JPA Testing Strategies

http://www.mkyong.com/hibernate/how-to-configure-log4j-in-hibernate-project/

http://damnhandy.com/2008/08/20/hibernate-and-and-the-found-two-representations-of-same-collection-error/

http://www.infoq.com/articles/testing-in-spring

https://github.com/michaelyaakoby/testfun

http://www.oracle.com/technetwork/articles/java/unittesting-455385.html

http://webcache.googleusercontent.com/search?q=cache:ZXg8qoRBYcMJ:struberg.wordpress.com/2012/03/27/unit-testing-strategies-for-cdi-based-projects/+&cd=1&hl=en&ct=clnk&gl=us

https://github.com/struberg/lightweightEE/blob/master/backend/src/test/java/de/jaxenter/eesummit/caroline/backend/test/CdiContainerTest.java

http://blog.novatec-gmbh.de/unit-testing-jee-applications-cdi/

https://ops4j1.jira.com/wiki/display/PAXEXAM3/Pax+Exam;jsessionid=884744B7143030CD229BB9105F0F482D

https://ops4j1.jira.com/wiki/display/PAXEXAM3/Maven+Dependencies

http://ocpsoft.org/jboss/cdi-powered-unit-testing-using-arquillian/

 

https://github.com/jbosstm/quickstart/tree/master/ArjunaJTA/maven
http://planet.jboss.org/post/cdi_powered_unit_testing_using_arquillian0

 

http://stackoverflow.com/questions/6469751/testing-an-ejb-with-junit
http://ejb3unit.sourceforge.net/
http://www.junitee.org/
http://www.caucho.com/articles/JavaEE6_Testing.pdf
http://myfaces.apache.org/extensions/cdi/
http://deltaspike.apache.org/container-control.html
http://ctpjava.blogspot.com/2009/10/unit-testing-ejbs-and-jpa-with.html
https://www.42lines.net/2011/11/21/adding-jpahibernate-into-the-cdi-and-wicket-mix/
http://www.murraywilliams.com/2012/04/maven-and-jpa-programming/
http://www.brainhemorage.com/?p=292
http://www.mastertheboss.com/cdi/cdi-and-jpa-tutorial
http://www.slideshare.net/agoncal/injection-with-cdi-in-15-minutes-29358389

 

Bootstrap CDI
http://antoniogoncalves.org/2011/01/12/bootstrapping-cdi-in-several-environments/

 

http://needle.spree.de/overview

 

https://community.jboss.org/wiki/CreatingUnitTestsWithWeldAndJunit4

 

http://deltaspike.apache.org/documentation.html#testing-snapshots

 

http://planet.jboss.org/post/cdi_powered_unit_testing_using_arquillian

 

http://jglue.org/cdi-unit/

 

http://www.javaworld.com/article/2099020/open-source-tools/integrating-arquillian-and-jbehave.html?source=IFWNLE_ifw_java_2014-02-25#tk.rss_all

 

https://github.com/matzew/spring-cdi-bridge

 

Test REST
https://opencast.jira.com/wiki/display/MH/Unit+Testing+REST+Endpoints

 

https://code.google.com/p/rest-assured/

 

http://www.tmro.net/2009/03/unit-test-jax-rs-using-java-6-and-junit-4/

 

http://hc.apache.org/

 

http://howtodoinjava.com/2013/08/03/jax-rs-2-0-resteasy-3-0-2-final-client-api-example/

 

REST Mock
https://github.com/robfletcher/betamax

 

http://download.eclipse.org/recommenders/updates/stable/

 

https://github.com/forge/core#jboss-forge-20
http://download.jboss.org/jbosstools/builds/staging/jbosstools-forge_master/all/repo/

 

http://projectlombok.org/