<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Functional Programming on KaPa Consulting</title><link>https://kapa-consulting.sk/categories/functional-programming/</link><description>Recent content in Functional Programming on KaPa Consulting</description><generator>Hugo</generator><language>en-us</language><lastBuildDate>Fri, 10 Apr 2026 10:00:00 +0100</lastBuildDate><atom:link href="https://kapa-consulting.sk/categories/functional-programming/index.xml" rel="self" type="application/rss+xml"/><item><title>PySpark Functional Programming: Stop Writing Imperative Spark Pipelines</title><link>https://kapa-consulting.sk/post/2026/04/2026-04-10-pyspark-functional-programming-intro/</link><pubDate>Fri, 10 Apr 2026 10:00:00 +0100</pubDate><guid>https://kapa-consulting.sk/post/2026/04/2026-04-10-pyspark-functional-programming-intro/</guid><description>&lt;p>On a recent project I had to review a set of PySpark notebooks in Microsoft Fabric: 14 notebooks, some of them over 3,000 lines long, with hundreds of cells and multiple data domains crammed into a single file. The code worked, but reading it felt like archaeology. Every notebook started the same way: &lt;code>df = spark.read...&lt;/code>, then &lt;code>df = df.withColumn(...)&lt;/code> repeated dozens of times, sprinkled with &lt;code>display(df)&lt;/code> calls and bare &lt;code>except:&lt;/code> blocks. I kept asking myself: how did we end up writing Spark code like this?&lt;/p></description></item></channel></rss>