My Areas of Expertise

Selecting Dynamic Columns In Spark DataFrames (aka Excluding Columns)

I often need to select the inverse of a set of columns in a DataFrame, that is, to exclude some columns from a query. The method is simple, and I use it frequently when arranging features into vectors for machine learning tasks.

import org.apache.spark.sql.Column
// Create an example dataframe
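The excerpt above is cut off before the author's code, so the following is only a minimal sketch of one common way to exclude columns, not necessarily the method the post uses. It assumes Spark SQL is on the classpath; the names `ExcludeColumns`, `keptNames`, and `selectExcept`, as well as the example columns `id`, `label`, and `score`, are hypothetical.

```scala
import org.apache.spark.sql.{Column, DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

object ExcludeColumns {
  // Hypothetical helper: the column names that survive the exclusion,
  // in their original order.
  def keptNames(all: Seq[String], excluded: Set[String]): Seq[String] =
    all.filterNot(name => excluded.contains(name))

  // Hypothetical helper: select every column of `df` except those in `excluded`.
  // Assumes plain column names (a name containing a dot would need backticks).
  def selectExcept(df: DataFrame, excluded: Set[String]): DataFrame = {
    val kept: Seq[Column] = keptNames(df.columns.toSeq, excluded).map(name => col(name))
    df.select(kept: _*)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("exclude-columns")
      .getOrCreate()
    import spark.implicits._

    // Made-up example data, for illustration only.
    val df = Seq((1, "a", 0.5), (2, "b", 1.5)).toDF("id", "label", "score")

    // Keep everything except the `label` column.
    selectExcept(df, Set("label")).printSchema()

    spark.stop()
  }
}
```

When the excluded names are known up front, `df.drop("label")` gives the same result more directly. Building the `Column` list explicitly is mainly useful when the kept columns feed into a later step, such as assembling a feature vector.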

Photography laser trigger trap

I've decided to try my hand at droplet and splash photography. Rather than using a manual method (setting up a constant drip and trying to snap a picture the moment a droplet lands), I decided to go the engineered route (surprise!). I did a lot of searching

Cheetah in tall grass in the Maasai Mara

Copyright: James Conner
Camera: Canon EOS 5D Mark III
Lens: Canon 100-400L F4 with 1.4x Internal Extender
Stats: 560mm, ƒ/5.6, 1/1600s, ISO 640
Taken: March 5, 2015

Intent lion cub in the Maasai Mara

Copyright: James Conner
Camera: Canon EOS 5D Mark III
Lens: Canon 100-400L F4 with 1.4x Internal Extender
Stats: 366mm, ƒ/5.6, 1/400s, ISO 400
Taken: February 28, 2015

Joining Data Frames in Spark SQL

Data

The data that I'm using for this test comes from Kaggle's [https://www.kaggle.com/] Titanic Project [https://www.kaggle.com/c/titanic]. The purpose of the Titanic project is to create a machine learning model that predicts which Titanic passengers survived. In
