My Areas of Expertise

How to run TensorFlow with GPU on Windows 10 in a Jupyter Notebook

Install CUDA Toolkit: The first step in our process is to install the CUDA Toolkit, which gives us the ability to run against the GPU's CUDA cores. Because TensorFlow [https://www.tensorflow.org/] is very version-specific, you'll have to go to the CUDA Toolkit

Transpose data with Spark

A short user-defined function, written in Scala, that lets you transpose a DataFrame without performing any aggregation.
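As a rough illustration of transposing without aggregation, here is a minimal sketch. It is not the function from the post: the `TransposeSketch` object, the `transpose` helper, and the output column names (`column`, `row_0`, `row_1`, ...) are all hypothetical, and the approach assumes the DataFrame is small enough to collect onto the driver.

```scala
import org.apache.spark.sql.{DataFrame, Row, SparkSession}
import org.apache.spark.sql.types.{StringType, StructField, StructType}

object TransposeSketch {
  // Hypothetical helper (not from the original post): transposes a SMALL
  // DataFrame by collecting it to the driver. All values become strings.
  def transpose(spark: SparkSession, df: DataFrame): DataFrame = {
    val columns = df.columns
    val rows: Array[Row] = df.collect() // assumption: the data fits in driver memory

    // One output row per input column: the column name, followed by that
    // column's value from every input row.
    val transposed: Seq[Row] = columns.indices.map { i =>
      val values = rows.map(r => if (r.isNullAt(i)) null else r.get(i).toString).toSeq
      Row.fromSeq(columns(i) +: values)
    }

    // Output schema: a name column plus one string column per input row.
    val schema = StructType(
      StructField("column", StringType, nullable = false) +:
        rows.indices.map(j => StructField(s"row_$j", StringType, nullable = true))
    )

    spark.createDataFrame(spark.sparkContext.parallelize(transposed), schema)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("transpose-sketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    val df = Seq((1, "a"), (2, "b")).toDF("id", "label")
    transpose(spark, df).show()

    spark.stop()
  }
}
```

Because the rows are collected, their order in the transposed output follows whatever order Spark returns them in; sort the input first if the order matters.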

Convert Spark Vectors to DataFrame Columns

Vectors are typically required for Machine Learning tasks, but are otherwise not commonly used. Sometimes you end up with an assembled Vector [https://spark.apache.org/docs/latest/ml-features.html#vectorassembler] that you just want to disassemble into its individual component columns so you can do some Spark SQL work,
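One possible way to disassemble a Vector column, sketched under assumptions: it relies on `vector_to_array` from `org.apache.spark.ml.functions` (available from Spark 3.0), which may not be the approach the post takes. The `VectorToColumns` object, the `disassemble` helper, and the column names `x`, `y`, `z`, `features`, and `arr` are hypothetical.

```scala
import org.apache.spark.ml.feature.VectorAssembler
import org.apache.spark.ml.functions.vector_to_array
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

object VectorToColumns {
  // Hypothetical helper: splits the Vector column `vectorCol` into one
  // Double column per entry of `names`, in order.
  def disassemble(df: DataFrame, vectorCol: String, names: Seq[String]): DataFrame = {
    // Convert the ML Vector into a plain array column first.
    val withArray = df.withColumn("arr", vector_to_array(col(vectorCol)))
    // Then pick each array element out into its own named column.
    val elementCols = names.zipWithIndex.map { case (name, i) =>
      col("arr").getItem(i).alias(name)
    }
    withArray.select(elementCols: _*)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("vector-to-columns")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    val inputCols = Array("x", "y", "z")
    val df = Seq((1.0, 2.0, 3.0), (4.0, 5.0, 6.0)).toDF(inputCols: _*)

    // Build an assembled Vector column so there is something to take apart.
    val assembled = new VectorAssembler()
      .setInputCols(inputCols)
      .setOutputCol("features")
      .transform(df)
      .select("features")

    disassemble(assembled, "features", inputCols.toSeq).show()

    spark.stop()
  }
}
```

The resulting columns are ordinary `Double` columns, so they can be queried with Spark SQL like any other.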

Pivoting data with Spark

One of the common data engineering tasks is taking a deep dataset and turning it into a wide dataset with some sort of aggregation function. Let's take a quick look at an example dataset to see why we would want to perform this action. Our goal: To determine if
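For illustration only, a deep-to-wide pivot with an aggregation might look like the sketch below. The dataset here (stores, months, amounts) is invented and is not the example dataset from the post; the `PivotSketch` object and `pivotSales` helper are hypothetical names.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.sum

object PivotSketch {
  // Hypothetical helper: one row per store, one column per distinct month,
  // each cell holding the summed amount for that store and month.
  def pivotSales(df: DataFrame): DataFrame =
    df.groupBy("store").pivot("month").agg(sum("amount"))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("pivot-sketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Deep (long) data: several rows per store. Invented for this sketch.
    val deep = Seq(
      ("A", "Jan", 10),
      ("A", "Feb", 20),
      ("A", "Jan", 5),
      ("B", "Jan", 7)
    ).toDF("store", "month", "amount")

    pivotSales(deep).show()

    spark.stop()
  }
}
```

Store and month combinations with no input rows come out as null in the wide result.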

Renaming All Columns In A Spark DataFrame

Here's an easy example of how to rename all columns in an Apache Spark DataFrame. Technically, we're creating a second DataFrame with the correct names. // IMPORT DEPENDENCIES import org.apache.spark.sql.SparkSession import org.apache.spark.sql.functions._ import org.apache.spark.sql.{SQLContext,
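A minimal sketch of the idea, assuming the new names are supplied in the same order as the existing columns. It uses `DataFrame.toDF(colNames: String*)`, which returns a new DataFrame rather than modifying the original; whether the post uses this exact call is an assumption, and the `RenameAllColumns` object, `renameAll` helper, and sample column names are hypothetical.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object RenameAllColumns {
  // Hypothetical helper: returns a new DataFrame whose columns are renamed
  // positionally to `newNames`. The input DataFrame is left unchanged.
  def renameAll(df: DataFrame, newNames: Seq[String]): DataFrame = {
    require(
      newNames.length == df.columns.length,
      s"Expected ${df.columns.length} names but got ${newNames.length}"
    )
    df.toDF(newNames: _*)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("rename-all-columns")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // A DataFrame with placeholder column names, as often seen after a
    // header-less CSV read.
    val df = Seq((1, "a"), (2, "b")).toDF("_c0", "_c1")

    val renamed = renameAll(df, Seq("id", "label"))
    renamed.printSchema()

    spark.stop()
  }
}
```

Since the rename is positional, the order of `newNames` matters; a name list built from a lookup map would need to follow `df.columns` order.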