Let’s explore the differences between single quotes, double quotes and backticks in SQL and Apache Spark.
In Spark SQL, single quotes, double quotes, and backticks mean different things and are not interchangeable. Knowing which to use where prevents subtle bugs and odd parse errors.
Single quotes
Single quotes are for string (and date/timestamp) literals in SQL text.
- Use single quotes around literal values in SQL queries:
WHERE country = 'US'
SELECT 'hello' AS msg
In the Scala API, a leading single quote historically created a Symbol (e.g., 'name) that Spark’s DataFrame DSL accepted as a column reference, though that style is deprecated in recent versions; in Python, single and double quotes are just interchangeable string delimiters with no special Spark meaning.
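As a minimal PySpark sketch (assuming a SparkSession named spark and a registered people view, both placeholders), single-quoted literals look like this:
# Single quotes delimit string and date/timestamp literals inside the SQL text.
spark.sql("SELECT 'hello' AS msg, DATE '2024-01-01' AS d").show()
spark.sql("SELECT name FROM people WHERE country = 'US'").show()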
Double quotes
Double quotes have two separate uses depending on context.
- In Spark SQL, double quotes delimit string literals by default, behaving just like single quotes. When ANSI double-quoted-identifier behavior is enabled (available in recent versions), double quotes instead delimit identifiers (column/table names), which is useful for names containing spaces, uppercase letters, or reserved keywords, for example SELECT "First Name" FROM people (see the sketch below).
- In code strings (Scala, Python, etc.), double quotes are just normal language string delimiters and do not change SQL semantics; they only matter for how you escape quotes inside the host language.
Guidelines:
- For SQL identifiers that need quoting and when ANSI behavior is desired, prefer double quotes in your SQL text.
- For plain string literals in SQL, do not use double quotes; use single quotes instead.
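A hedged sketch of the default behavior, assuming a SparkSession named spark; the identifier-mode configs shown in the comments are assumptions that vary by Spark version, so verify them against your version’s documentation:
# By default, Spark SQL parses double-quoted text as a string literal,
# exactly like single quotes: this returns the constant "First Name",
# not a column.
spark.sql('SELECT "First Name" AS value').show()
# With ANSI double-quoted identifiers enabled (config names are an
# assumption; check your Spark version), the same quotes would instead
# reference a column named First Name:
# spark.sql("SET spark.sql.ansi.enabled=true")
# spark.sql("SET spark.sql.ansi.doubleQuotedIdentifiers=true")
# spark.sql('SELECT "First Name" FROM people').show()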
Backticks
Backticks (`) are Spark’s classic way to quote identifiers in SQL. Use them to reference column or table names that contain special characters, spaces, or reserved keywords, for example when:
- The column/table name contains spaces: SELECT `first name` FROM people;
- The name contains special characters (e.g., -, #, $): SELECT `order-id` FROM orders;
- The name is a reserved keyword in SQL: SELECT `select` FROM commands;
Backticks are especially common in Spark’s DataFrame API and remain widely used and supported.
Guidelines:
- Use backticks (or double quotes in ANSI mode) only around identifiers, never around data values.
- When in doubt or writing portable Spark SQL that must tolerate odd column names, consistently quote identifiers with backticks.
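One edge case worth knowing: an identifier can itself contain a backtick, which you escape by doubling it. A minimal PySpark sketch (the column name is purely illustrative):
# A doubled backtick inside a quoted identifier stands for a literal
# backtick, so this yields a column literally named weird`name.
spark.sql("SELECT 1 AS `weird``name`").show()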
Whenever possible, avoid using spaces, special characters, or reserved keywords as identifiers or aliases for columns and tables, because such names are problematic for both people and query engines. Normalizing them early (as in the sketch below) saves quoting everywhere else.
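A minimal sketch of that normalization, assuming a DataFrame df with awkward column names; the regex and snake_case convention are just one possible choice:
import re

# Rename every column to lowercase snake_case so later queries need
# no quoting at all (df and its column names are placeholders).
def normalize(name):
    return re.sub(r"[^0-9a-zA-Z]+", "_", name).strip("_").lower()

df_clean = df.toDF(*[normalize(c) for c in df.columns])
df_clean.printSchema()  # e.g., "full name" -> full_name, "order-id" -> order_id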
Examples in Spark SQL
-- Column with space
SELECT `full name`, age
FROM employees;
-- Column with dash
SELECT `order-id`, amount
FROM sales;
-- Reserved keyword as column name
SELECT `date`, revenue
FROM reports;
Using Backticks in PySpark
When writing SQL queries in PySpark:
df.createOrReplaceTempView("employees")
# Query with backticks
spark.sql("""
SELECT `full name`, age
FROM employees
""").show()
Alternative in DataFrame API
Instead of backticks, you can use col() from pyspark.sql.functions:
from pyspark.sql.functions import col
df.select(col("full name"), col("order-id")).show()
This avoids SQL parsing issues and is safer for programmatic column selection.
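One subtlety here: in the DataFrame API a dot inside col() is read as struct-field access, so a column whose name literally contains a dot still needs backticks. A short sketch (the column name a.b is illustrative):
from pyspark.sql.functions import col

# col("a.b") means field b of struct column a; to select a column whose
# name is literally "a.b", wrap it in backticks.
df.select(col("`a.b`")).show()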
Summary: single quotes, double quotes and backticks per SQL dialect
- Standard SQL (ANSI) and PostgreSQL: single quotes for string literals, double quotes for identifiers; backticks are not used.
- MySQL: single quotes for strings, backticks for identifiers; double quotes act as string delimiters unless ANSI_QUOTES mode is enabled.
- SQL Server: single quotes for strings, square brackets (or double quotes with QUOTED_IDENTIFIER ON) for identifiers; no backticks.
- Spark SQL: single quotes for strings, backticks for identifiers; double quotes are string literals by default and identifiers only with ANSI double-quoted identifiers enabled.
Conclusion
Choosing between single quotes, double quotes, and backticks in SQL is ultimately about clarity and intent: single quotes should always express literal values, double quotes (or backticks, depending on dialect) should clearly identify objects like columns and tables, and mixing the two roles is what leads to confusion and bugs. By consistently using single quotes for strings and understanding when your database expects double quotes or backticks for identifiers, you make your queries more readable, more portable across systems, and far easier to debug. With these conventions in place, the quoting rules stop being a source of frustration and instead become a simple, reliable part of how you communicate with your database.
