Das Digital Digest
Azure Budget
How to set an Azure Budget