
// academic · cloud computing

AWS Cloud Computing Labs

Distributed computing with AWS EMR, Hadoop, Hive, and Pig

Coursework covering distributed data processing on AWS. Labs included cluster configuration, MapReduce job design, Hive query optimization, and large-scale ETL pipeline patterns using Pig.

AWS EMR

Cluster configuration, node sizing, and managed Hadoop framework deployment on Elastic MapReduce.
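A managed Hadoop cluster of the kind used in these labs can be launched with the AWS CLI. This is a minimal sketch, not the exact lab configuration: the cluster name, release label, S3 bucket, and instance sizing are illustrative placeholders.

```shell
# Launch a small EMR cluster with Hadoop, Hive, and Pig installed.
# Names, release label, and S3 paths below are placeholders.
aws emr create-cluster \
  --name "labs-cluster" \
  --release-label emr-6.15.0 \
  --applications Name=Hadoop Name=Hive Name=Pig \
  --instance-type m5.xlarge \
  --instance-count 3 \
  --use-default-roles \
  --log-uri s3://my-bucket/emr-logs/
```

One primary node plus two core nodes (`--instance-count 3`) is a common starting point for coursework-scale jobs; node sizing is then tuned to the dataset and workload.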

Hadoop & MapReduce

MapReduce job design, combiner functions, shuffle optimization, and distributed data locality principles.
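The effect of a combiner can be illustrated with a small local simulation (a sketch, not Hadoop's actual API): the combiner pre-aggregates each mapper's output by key, so the shuffle moves far fewer `(word, count)` pairs across the network.

```python
from collections import defaultdict

def mapper(line):
    # Emit (word, 1) for every word in the input line.
    for word in line.split():
        yield word.lower(), 1

def combiner(pairs):
    # Local pre-aggregation on the mapper side: collapses duplicate
    # keys before the shuffle, reducing network transfer volume.
    counts = defaultdict(int)
    for key, value in pairs:
        counts[key] += value
    return counts.items()

def reducer(key, values):
    # Final aggregation after the shuffle groups values by key.
    return key, sum(values)

def run_job(splits):
    # Simulate the shuffle: group combiner output by key across splits.
    shuffled = defaultdict(list)
    for split in splits:
        mapped = (pair for line in split for pair in mapper(line))
        for key, value in combiner(mapped):
            shuffled[key].append(value)
    return dict(reducer(k, v) for k, v in shuffled.items())

counts = run_job([["big data big"], ["data big pipelines"]])
print(counts)  # {'big': 3, 'data': 2, 'pipelines': 1}
```

Without the combiner, the first split would ship three pairs into the shuffle instead of two; on real word-count workloads the savings scale with key repetition per mapper.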

Hive

HiveQL query writing, table partitioning, bucketing, and query execution plan analysis.
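These techniques combine in a table definition like the following HiveQL sketch; the table name, columns, and bucket count are hypothetical, but the pattern is standard: partition by a coarse filter column so queries prune whole directories, and bucket by a join/aggregation key.

```sql
-- Hypothetical table: partitioned by date, bucketed by user id.
CREATE TABLE page_views (
  user_id BIGINT,
  url     STRING
)
PARTITIONED BY (event_date STRING)
CLUSTERED BY (user_id) INTO 32 BUCKETS
STORED AS ORC;

-- Partition pruning: only the 2024-01-15 partition is scanned.
SELECT url, COUNT(*) AS hits
FROM page_views
WHERE event_date = '2024-01-15'
GROUP BY url;

-- Inspect the execution plan to confirm pruning took effect.
EXPLAIN SELECT url FROM page_views WHERE event_date = '2024-01-15';
```

`EXPLAIN` output shows which partitions the scan touches, which is the basis of the query-plan analysis mentioned above.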

Pig

Pig Latin scripting for multi-step ETL pipelines, UDF usage, and data transformation workflows.
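A representative multi-step pipeline in Pig Latin looks like this sketch (the S3 paths and schema are placeholders, not the actual lab data): each statement names an intermediate relation, so load, filter, group, and aggregate steps chain into one dataflow.

```pig
-- Hypothetical ETL: load raw logs, filter, aggregate, store.
raw    = LOAD 's3://my-bucket/logs/' USING PigStorage('\t')
         AS (user_id:long, url:chararray, status:int);
ok     = FILTER raw BY status == 200;
by_url = GROUP ok BY url;
hits   = FOREACH by_url GENERATE group AS url, COUNT(ok) AS n;
STORE hits INTO 's3://my-bucket/output/hits';
```

Pig compiles this script into a series of MapReduce jobs, which is what makes it convenient for expressing ETL stages without hand-writing each job.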

Skills

AWS EMR · Hadoop · Hive · Pig · MapReduce · Distributed Computing · Cluster Configuration · Data Pipeline Design