Chapter 6: Manipulating Strings

The “Messy Data” Crisis

Scene: The Computer Lab. Chaitanya is staring at a spreadsheet on his screen, looking horrified.

Chaitanya: Ma’am, the data from the new online admission form is a disaster!

  • Some students typed their names in all caps: RAHUL.
  • Some used all lowercase: simran.
  • Some added random spaces: Chaitanya .
  • And someone typed their phone number as Eight-Zero-Zero...

Chaitanya: I have to fix 500 entries manually before we can print the ID cards!

Aditi Ma’am: Step away from the keyboard, Chaitanya. You are trying to clean the room with a toothbrush. You need a power washer.

Chaitanya: A power washer?

Aditi Ma’am: Python’s String Manipulation tools. Up until now, you’ve treated strings like simple labels. But strings are actually complex sequences that can be sliced, searched, and scrubbed clean.

String Literals (The Rules of Text)

Aditi Ma’am: First, let’s talk about how we write text. You know about single quotes ' ' and double quotes " ". But what if you need to use a quote inside a string?

Chaitanya: Like writing It's time?

Aditi Ma’am: Exactly. If you write 'It's time', Python thinks the string ends at the apostrophe, cannot make sense of the leftover s time', and raises a SyntaxError. You have to use an Escape Character: the backslash \.

Python

>>> print('It\'s a "School System" error.')
It's a "School System" error.

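Aditi Ma’am: The \ tells Python: “The next character is special.” Before a quote, it means “this is text, not the end of the string.” Before certain letters, such as n or t, it produces a character you cannot type directly.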

Common Escape Characters:

  • \': Single Quote
  • \": Double Quote
  • \t: Tab (Indentation)
  • \n: Newline (Pressing Enter)
  • \\: Backslash (If you actually need to print a \)

Chaitanya: \n is useful. I can print a whole list in one line of code.

Python

print('Name:\tChaitanya\nClass:\t10th')

Output:

Name:   Chaitanya
Class:  10th
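To see that an escape sequence such as \n or \t is stored as a single character, it can help to measure a few strings with len(). This is only a minimal sketch; the variable names are made up for illustration.

```python
# Each escape sequence counts as ONE character inside the string.
tabbed = 'Name:\tChaitanya'
two_lines = 'Name\nClass'

print(len(tabbed))      # 15: 5 + 1 tab + 9
print(len(two_lines))   # 10: 4 + 1 newline + 5

# Choosing the other quote style avoids escaping altogether.
same_text = "It's time" == 'It\'s time'
print(same_text)        # True
```

Both spellings of It's time produce the same string, so pick whichever is easier to read.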

Raw Strings (The “Ignore Me” Mode)

Aditi Ma’am: Sometimes, you have so many backslashes (like in a Windows file path C:\Users\Name) that escaping them is annoying. You can use a Raw String. Just put an r before the quote.

Python

print(r'C:\Users\Chaitanya\Notes')

Aditi Ma’am: This tells Python: “Don’t look for escape characters. Just print exactly what I typed.”
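The difference is easiest to see by comparing a raw string with its escaped twin. The sketch below uses a made-up path, C:\new\table, chosen because \n and \t would otherwise be read as a newline and a tab.

```python
# A raw string and a fully escaped string hold the same characters.
raw_path = r'C:\new\table'
escaped_path = 'C:\\new\\table'
print(raw_path == escaped_path)   # True

# Without the r prefix, \n and \t turn into a newline and a tab.
broken_path = 'C:\new\table'
print(len(raw_path), len(broken_path))   # 12 10
```

One limitation to keep in mind: a raw string cannot end with a single backslash, so r'C:\Users\' is a syntax error.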

Multiline Strings (The “Triple Quote”)

Aditi Ma’am: If you have a huge block of text—like a letter to parents—you don’t want to type \n at the end of every line. Use Triple Quotes '''.

Python

letter = '''Dear Parents,
The school will be closed on Monday due to 
the "Server Upgrade" project.
Regards,
Aditi Ma'am'''

Chaitanya: It kept the line breaks exactly as I typed them!
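Under the hood, a triple-quoted string simply stores each line break as a \n character. The short sketch below, using a trimmed-down version of the letter, shows this and uses splitlines() to get the lines back as a list.

```python
notice = '''Dear Parents,
The school will be closed on Monday.
Regards,
Aditi Ma'am'''

# The line breaks are stored as ordinary \n characters.
print(notice.count('\n'))   # 3 newlines join the 4 lines

# splitlines() turns the block back into a list of lines.
lines = notice.splitlines()
print(lines[0])             # Dear Parents,
print(len(lines))           # 4
```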

Indexing and Slicing Strings

Aditi Ma’am: Remember how team[0] gave you the first player in a list? Strings are indexed and sliced the same way. Think of a string as a sequence of characters, numbered from 0.

Python

spam = 'Hello world!'
spam[0]  # 'H'
spam[4]  # 'o'
spam[-1] # '!'
spam[0:5] # 'Hello'

Chaitanya: Can I change a character? spam[0] = 'J'?

Aditi Ma’am: No! Strings are Immutable (unchangeable), so that line raises a TypeError. You cannot change an existing string. You have to create a new one from pieces of the old one.

Python

spam = 'J' + spam[1:] # Creates 'Jello world!'
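Here is a small sketch of what happens when you try to change a character in place, and how slices build a new string instead. The name jello is just for illustration.

```python
spam = 'Hello world!'

# Assigning to an index fails because strings are immutable.
try:
    spam[0] = 'J'
except TypeError as error:
    print('Not allowed:', error)

# Slices build new strings; the original is left untouched.
jello = 'J' + spam[1:]
print(jello)      # Jello world!
print(spam)       # Hello world!
print(spam[6:])   # world!
print(spam[:5])   # Hello
```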

The in and not in Operators

Aditi Ma’am: Just like checking if a student is in a list, you can check if a substring is in a string.

Python

>>> 'Hello' in 'Hello World'
True
>>> 'Chaitanya' in 'Hello World'
False
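One detail worth testing for yourself: the in check is case-sensitive. The sketch below uses illustrative variable names to show the difference and the not in form.

```python
greeting = 'Hello World'

exact_match = 'hello' in greeting             # False: the check is case-sensitive
relaxed_match = 'hello' in greeting.lower()   # True once the text is lowercased
is_missing = 'Chaitanya' not in greeting      # True

print(exact_match, relaxed_match, is_missing)   # False True True
```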

The upper(), lower(), and title() Methods

Aditi Ma’am: Now, let’s fix your messy data problem.

  1. upper(): CONVERTS TO ALL CAPS.
  2. lower(): converts to all lowercase.
  3. title(): Capitalizes The First Letter Of Each Word.

Chaitanya: So for the ID cards, I can just force everything to be uniform?

Python

name = '  chAiTanYa  '
clean_name = name.strip().upper()  # 'CHAITANYA': spaces removed, then all caps

Aditi Ma’am: Yes! And lower() is crucial for comparing user input. If a user types “Yes”, “YES”, or “yes”, you want the program to understand all of them.

Python

response = input()
if response.lower() == 'yes':
    print('Confirmed.')
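Applied to the admission-form problem, the case methods might be combined like this. It is a sketch with sample entries modelled on the messy data above, not the real spreadsheet.

```python
# Sample entries modelled on the messy admission-form data.
raw_names = ['RAHUL', 'simran', '  chAiTanYa  ']

# strip() removes the stray spaces, title() fixes the capitalization.
id_card_names = []
for name in raw_names:
    id_card_names.append(name.strip().title())
print(id_card_names)   # ['Rahul', 'Simran', 'Chaitanya']

# Caveat: title() also capitalizes the letter after an apostrophe.
tricky = "aditi ma'am".title()
print(tricky)          # Aditi Ma'Am
```

Because of that apostrophe behaviour, names such as O'Brien come out fine, but words like ma'am may need a manual check.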

The isX() Methods (The Input Police)

Aditi Ma’am: Chaitanya, you mentioned someone typed “Eight” instead of “8” for their phone number. You can prevent that using Validation Methods. These return True or False.

  • isalpha(): Letters only ('ABC'). No numbers, no spaces.
  • isalnum(): Letters and numbers only ('A1').
  • isdecimal(): Digits only ('123'). No minus sign, no decimal point.
  • isspace(): Only whitespace (spaces, tabs, newlines).
  • istitle(): Title Case ('Hello World').

Chaitanya: So I can write a loop that forces them to enter a number?

Python

while True:
    print('Enter your age:')
    age = input()
    if age.isdecimal():
        break
    print('Please enter a number, not text.')

Aditi Ma’am: Exactly. Never trust user input. Always validate it.
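The same idea can catch the “Eight-Zero-Zero” phone number. The sketch below assumes a valid number is exactly 10 digits; both that rule and the function name is_valid_phone are assumptions made for this example.

```python
def is_valid_phone(text):
    """Return True if text is exactly 10 digits (an assumed rule for this sketch)."""
    cleaned = text.strip()
    return cleaned.isdecimal() and len(cleaned) == 10

print(is_valid_phone('8005551234'))        # True
print(is_valid_phone(' 8005551234 '))      # True: surrounding spaces are stripped
print(is_valid_phone('Eight-Zero-Zero'))   # False
print(is_valid_phone('800-555-1234'))      # False: hyphens are not digits
print(is_valid_phone(''))                  # False: isdecimal() is False for ''
```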

join() and split() (The Converters)

Aditi Ma’am: Sometimes you need to convert a List to a String, or a String to a List.

  • join(): Glues a list of strings together. You call it on the separator: ', '.join(teams).
  • split(): Chops a string apart.

Example 1: The ID Card Printer (join)

Python

teams = ['Red', 'Blue', 'Green']
print(', '.join(teams))

Output: Red, Blue, Green

Example 2: The Data Parser (split)

Aditi Ma’am: Imagine you download a CSV file where data is separated by commas: "Chaitanya,15,Red".

Python

data = 'Chaitanya,15,Red'
items = data.split(',')

Result: ['Chaitanya', '15', 'Red'] (Now it’s a list!)

Chaitanya: split() is basically the “Text-to-Columns” feature in Excel!

Aditi Ma’am: Precisely. And by default, split() splits by whitespace, which is great for counting words in a sentence.
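Putting the two converters together, a record can be split into fields, used, and glued back. This is a minimal sketch; the variable names and the sample sentence are invented for the example.

```python
record = 'Chaitanya,15,Red'

# Unpack the three fields and convert the age from text to a number.
name, age_text, team = record.split(',')
age = int(age_text)
print(name, age + 1, team)   # Chaitanya 16 Red

# join() reverses split() when the same separator is used.
rebuilt = ','.join([name, age_text, team])
print(rebuilt == record)     # True

# With no argument, split() breaks on any run of whitespace.
sentence = 'The  school will be closed\ton Monday'
word_count = len(sentence.split())
print(word_count)            # 7
```

For real CSV files, where a field may itself contain a comma inside quotes, Python's built-in csv module is the safer choice.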

Justifying Text (rjust, ljust, center)

Chaitanya: Ma’am, my report card output looks messy because the names are different lengths. The grades aren’t aligning.

Alice 90
Christopher 85
Bob 92

Aditi Ma’am: You need to Pad the text so they all take up the same amount of space. Use ljust() (Left Justify) to add spaces on the right, rjust() (Right Justify) to add spaces on the left, or center() to add spaces on both sides.

Python

print('Alice'.ljust(15) + '90')
print('Christopher'.ljust(15) + '85')

Output:

Alice          90
Christopher    85

Aditi Ma’am: It adds spaces to the right of ‘Alice’ until the string is 15 characters long. Now everything lines up perfectly.
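A fuller version of the report card might loop over all three students and right-justify the scores as well. The sketch below stores the scores in a dictionary and picks column widths of 15 and 3; both choices are assumptions for illustration.

```python
scores = {'Alice': 90, 'Christopher': 85, 'Bob': 92}

rows = []
for student, score in scores.items():
    # Names are padded on the right, scores on the left, so both columns align.
    rows.append(student.ljust(15) + str(score).rjust(3))
print('\n'.join(rows))

# An optional second argument sets the fill character.
dotted = 'Alice'.ljust(15, '.') + '90'
banner = ' REPORT CARD '.center(21, '=')
print(dotted)   # Alice..........90
print(banner)   # ==== REPORT CARD ====
```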

Removing Whitespace (strip, rstrip, lstrip)

Aditi Ma’am: This is the most important cleaning tool.

  • strip(): Removes whitespace from both ends.
  • lstrip(): Removes from the Left.
  • rstrip(): Removes from the Right.

Python

name = '   Chaitanya   '
clean = name.strip() # 'Chaitanya'

Aditi Ma’am: Always .strip() user input immediately. You don’t want your database to fail just because someone accidentally hit the Spacebar after typing their name.
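The three methods are easier to tell apart with repr(), which makes the leftover whitespace visible. The sketch below also shows the optional argument that strips other characters; the sample strings are invented.

```python
entry = '  \tChaitanya \n'

print(repr(entry.strip()))    # 'Chaitanya'
print(repr(entry.lstrip()))   # 'Chaitanya \n'
print(repr(entry.rstrip()))   # '  \tChaitanya'

# An optional argument lists the characters to remove instead of whitespace.
starred = '***Rahul***'.strip('*')
print(starred)                # Rahul

# The argument is a set of characters, not a prefix or suffix.
trimmed = 'xyxRahulyx'.strip('xy')
print(trimmed)                # Rahul
```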

The pyperclip Module (The Clipboard)

Aditi Ma’am: Finally, let’s automate the most boring task of all: Copy and Paste. Python can read your clipboard!

Chaitanya: You mean Ctrl+C and Ctrl+V?

Aditi Ma’am: Yes. You need to install it first (pip install pyperclip), but once you have it:

Python

import pyperclip
pyperclip.copy('Hello School!')  # Like Ctrl+C: puts the text on the clipboard
text = pyperclip.paste()         # Like Ctrl+V: returns the clipboard text

Project Idea: You can write a script that takes a messy list of names from your clipboard, cleans them up (strips spaces, fixes capitalization), and copies the clean list back to your clipboard instantly.
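One possible starting point for that project is sketched below. The function names clean_names and clean_clipboard are hypothetical, and the sketch assumes one name per line on the clipboard. The cleaning logic is kept separate from the clipboard calls so it can be tried without pyperclip installed.

```python
def clean_names(text):
    """Return one tidied name per line, skipping blank lines."""
    cleaned = []
    for line in text.splitlines():
        name = line.strip()
        if name:                      # ignore lines that were only whitespace
            cleaned.append(name.title())
    return '\n'.join(cleaned)


def clean_clipboard():
    """Read names from the clipboard, clean them, and copy the result back."""
    import pyperclip                  # third-party: pip install pyperclip
    pyperclip.copy(clean_names(pyperclip.paste()))


messy = 'RAHUL\n  simran\n\n Chaitanya  \n'
result = clean_names(messy)
print(result)
```

To use it on real data, copy the messy names, call clean_clipboard(), and paste the result wherever you need it.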

Summary Box

  • Escape Characters: \n (New Line), \' (Quote).
  • Raw Strings: r'Text' (Treats backslashes as plain characters).
  • Indexing: Strings are indexed and sliced like Lists (text[0], text[0:5]), but they are Immutable.
  • Case Methods: upper(), lower(), title().
  • Check Methods: isalpha(), isdecimal().
  • Converters: join() (List → String), split() (String → List).
  • Formatting: rjust(), ljust(), center().
  • Cleaning: strip() removes whitespace.

Aditi’s Pro-Tip: “A large part of data work is just cleaning messy text. Master split() and strip(), and you have mastered the basics of data wrangling.”

