Mar 27, 2013

Sr Hadoop Engineer - San Francisco, CA - 12 Month Contract

Urgently send resumes to rituraj@x-istech.com.
Direct client requirement. We can schedule an immediate interview.

Hadoop Engineer

We need a strong Java/Hadoop architect with deep technical knowledge across ETL, PL/SQL, and related technologies. You should understand the technology well, as you will
be counted on to advise on technology decisions.

The ideal candidate will be a seasoned Java programmer with a strong computer science background (sorting, hashing, etc.) on the Unix platform;
also familiar with one or more functional languages (e.g., Scala, Clojure) and ideally experienced with Hadoop MapReduce in particular.
He/she will also have a keen interest in, and/or hands-on experience with, multiple distributed computing/storage frameworks (e.g., Storm, Cassandra, HBase, Spark)
as well as Hadoop-specific DSLs (e.g., Scoobi, Scrunch, Cascalog), and be motivated to solve complex problems involving large (long and wide) data sets
efficiently and scalably.

Responsibilities

Design and implement a Hadoop-based ETL process.
Work with data scientists to re-implement machine-learned algorithms in MapReduce (coding in Java and/or higher-level DSLs), or in other appropriate distributed
frameworks.
Design and implement a real-time event streaming system (with Hadoop integration).
Work with project managers and back-end/middle-tier engineers to develop and target data flow APIs.


Requirements

- Languages: Java.
- Scripting languages: Perl, Python or similar.
- CS algorithms: sorting, hashing, recursion, trees, graphs, etc.
- Hadoop core: MapReduce, HDFS.
- Hadoop utilities: Oozie, ZooKeeper.
- Relational algebra (SQL).
- Unix shell programming (sh, bash, csh, zsh): pipes, redirection, process control, etc.
- Unix pipeline utilities: awk, sed, grep, find, etc.
- Unix system utilities: cron, at, kill, ssh, sftp, etc.
- Regular expressions.
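To illustrate the MapReduce item above: the core pattern is map (emit key/value pairs), shuffle (group by key), and reduce (aggregate each group). Below is a hypothetical word-count sketch of that pattern using only the JDK's stream API, not the actual Hadoop Mapper/Reducer classes; the class name and input are illustrative.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical sketch of the MapReduce pattern with plain JDK streams.
// A real Hadoop job would express the same three steps as a Mapper class,
// the framework's shuffle, and a Reducer class over HDFS input splits.
public class WordCount {

    // map: split input into words (the "emit (word, 1)" step);
    // shuffle: group identical words together; reduce: count each group.
    public static Map<String, Long> count(String input) {
        return Arrays.stream(input.toLowerCase().split("\\s+"))
                .filter(w -> !w.isEmpty())
                .collect(Collectors.groupingBy(w -> w, Collectors.counting()));
    }

    public static void main(String[] args) {
        System.out.println(count("to be or not to be"));
    }
}
```

In a real Hadoop job the shuffle is distributed across the cluster; the in-memory `groupingBy` here only stands in for it conceptually.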


Desirable skills

- Hadoop cluster administration: queues, quotas, replication, block size, decommission nodes, add nodes, etc.
- JVM-based functional languages: Scala, Clojure.
- Hadoop pipeline frameworks: Streaming, Crunch, Cascading.
- Hadoop productivity frameworks: Scrunch, Scoobi.
- Hadoop query languages: Pig, Hive, Scalding, Cascalog, PyCascading.
- Hadoop libraries: Mahout.
- Alternative HDFS-based computing frameworks: Spark (Pregel).
- Serialization frameworks: Avro, Thrift, Protocol Buffers.
- Distributed databases: Cassandra, Voldemort, HBase, MongoDB, CouchDB.
- Real-time event streaming: Storm, S4, InfoSphere Streams (IBM).
- Statistics, data mining or machine learning: expectation, regression, clustering, etc.
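For the statistics/machine-learning bullet above, the simplest case of the regression mentioned can be sketched in a few lines. This is a hypothetical least-squares fit of y = a + b·x via the closed-form normal equations; the class name and sample data are illustrative only.

```java
// Hypothetical ordinary-least-squares sketch: fit y = a + b*x
// from paired samples using the closed-form normal equations.
public class LinearFit {

    // Returns {intercept a, slope b} of the best-fit line through (x[i], y[i]).
    public static double[] fit(double[] x, double[] y) {
        int n = x.length;
        double sx = 0, sy = 0, sxx = 0, sxy = 0;
        for (int i = 0; i < n; i++) {
            sx += x[i];
            sy += y[i];
            sxx += x[i] * x[i];
            sxy += x[i] * y[i];
        }
        double b = (n * sxy - sx * sy) / (n * sxx - sx * sx);
        double a = (sy - b * sx) / n;
        return new double[] { a, b };
    }

    public static void main(String[] args) {
        // These points lie exactly on y = 1 + 2x.
        double[] ab = fit(new double[] { 0, 1, 2, 3 }, new double[] { 1, 3, 5, 7 });
        System.out.printf("y = %f + %f x%n", ab[0], ab[1]);
    }
}
```

At Hadoop scale, the sums (sx, sy, sxx, sxy) are exactly the kind of per-key aggregates a reducer would compute, which is why this closed form parallelizes naturally.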


Useful background skills

- Specific experience with the Cloudera Hadoop distribution.
- Unix system administration: sudo, mountd, bind, sendmail, etc.
- Database administration: MySQL, SQLite, Oracle, or similar.

Thanks!


Rituraj Borooah
___________________________________________________________

Also check:
Google Group Xrecnet - An IT Recruiters Network for US IT requirements.

--
You received this message because you are subscribed to the Google Groups "US_Jobs&Consultants" group.
To unsubscribe from this group and stop receiving emails from it, send an email to us_jobsnconsultants+unsubscribe@googlegroups.com.
To post to this group, send email to us_jobsnconsultants@googlegroups.com.
Visit this group at http://groups.google.com/group/us_jobsnconsultants?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.