Mastering the Linux “cut” Command: Byte, Character, and Field Extraction Techniques

This guide explains how to use the Linux cut command to extract specific bytes, characters, or fields from text files, detailing common options, practical examples, and pitfalls such as handling multibyte characters and custom delimiters.

Liangxu Linux

Command Overview

The cut command trims and extracts portions of each line from a file or standard input based on specified criteria. It is often combined with other commands to pull out exactly the data you need.

Basic Syntax

cut [options] file

Key options include:

-b: select by byte position
-c: select by character position
-d: set a custom field delimiter (the default is a tab)
-f: select fields (typically used with -d)
-n: do not split multibyte characters (used with -b)
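As a small illustration of these options, the sketch below pipes tab-separated text into cut on standard input. The sample records are made up for the example. Because a tab is the default delimiter, -d can be left out.

```shell
#!/bin/sh
# Hypothetical sample data: two tab-separated records (name, role).
sample=$(printf 'alice\tadmin\nbob\tdev\n')

# No file argument: cut reads standard input.
# No -d option: fields are split on the default delimiter, a tab.
roles=$(printf '%s\n' "$sample" | cut -f 2)
printf '%s\n' "$roles"

# -b works the same way on standard input: first byte of each line.
initials=$(printf '%s\n' "$sample" | cut -b 1)
printf '%s\n' "$initials"
```

The first command prints the second field of each record, the second prints the first byte of each line.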

Byte‑Based Extraction

Given a file test1.txt containing the days of the week, the following examples demonstrate byte selection:

1. First byte of each line

[alvin@VM_0_16_centos cut]$ cut -b 1 test1.txt
M
T
W
T
F
S
S

2. Bytes 2, 4, 6

[alvin@VM_0_16_centos cut]$ cut -b 2,4,6 test1.txt
ody
usa
ens
hrd
rdy
aud
udy

3. Byte range 3‑6

[alvin@VM_0_16_centos cut]$ cut -b 3-6 test1.txt
nday
esda
dnes
ursd
iday
turd
nday

4. Bytes 1‑3 and 6

[alvin@VM_0_16_centos cut]$ cut -b 1-3,6 test1.txt
Mony
Tuea
Weds
Thud
Friy
Satd
Suny

5. First three bytes, or everything from the third byte onward

[alvin@VM_0_16_centos cut]$ cut -b -3 test1.txt
Mon
Tue
Wed
Thu
Fri
Sat
Sun
[alvin@VM_0_16_centos cut]$ cut -b 3- test1.txt
nday
esday
dnesday
ursday
iday
turday
nday
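To try the byte examples yourself, the sketch below recreates the input file. It assumes test1.txt simply holds the seven English day names, one per line; the original file is not listed, so this is inferred from the outputs above. It also shows that cut writes the selected bytes in input order, whatever order the list is given in.

```shell
#!/bin/sh
# Assumption: test1.txt holds the seven English day names, one per line
# (inferred from the outputs shown above; the original file is not listed).
workdir=$(mktemp -d)
printf '%s\n' Monday Tuesday Wednesday Thursday Friday Saturday Sunday > "$workdir/test1.txt"

# A list may mix single positions and ranges, separated by commas.
picked=$(cut -b 1-3,6 "$workdir/test1.txt")
printf '%s\n' "$picked"

# cut emits bytes in input order, so reordering the list changes nothing.
reordered=$(cut -b 6,1-3 "$workdir/test1.txt")
printf '%s\n' "$reordered"

rm -r "$workdir"
```

Both commands print the same seven lines, so cut cannot be used to rearrange bytes within a line.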

Character‑Based Extraction

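Bytes and characters differ for multibyte text. For example, a Chinese character occupies multiple bytes (typically three in UTF-8) but counts as a single character. Use -c to select by character, or -nb to select by byte without splitting multibyte characters. Support for both depends on the cut implementation and the locale: the GNU coreutils manual has long documented -c as equivalent to -b and -n as a no-op, while some distribution builds add multibyte handling.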

Given test2.txt with Chinese lines, extracting the second character works as follows:

[alvin@VM_0_16_centos cut]$ cut -nb 2 test2.txt
许
望
灰
哥
[alvin@VM_0_16_centos cut]$ cut -c 2 test2.txt
许
望
灰
哥
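The byte/character difference can be checked without relying on -c or -n. The sketch below uses made-up sample text and assumes the script is saved as UTF-8, where each of the two characters takes three bytes.

```shell
#!/bin/sh
# Assumption: this script is saved as UTF-8, so each Chinese character
# below is encoded as three bytes. The sample text is made up.
line='你好'

# wc -c counts bytes: two 3-byte characters = 6 bytes.
nbytes=$(printf '%s' "$line" | wc -c | tr -d ' ')
printf 'bytes: %s\n' "$nbytes"

# Selecting whole 3-byte groups with -b keeps each character intact.
first=$(printf '%s\n' "$line" | cut -b 1-3)
second=$(printf '%s\n' "$line" | cut -b 4-6)
printf '%s %s\n' "$first" "$second"
```

By contrast, a selection such as cut -b 1 on this input would emit only the first byte of a three-byte sequence, which is not a valid character on its own.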

Field‑Based Extraction

Fields are sections separated by a delimiter. Using /etc/passwd as an example, each line’s fields are delimited by colons (:). To extract the first field (the username) from the first five lines:

[alvin@VM_0_16_centos cut]$ cat /etc/passwd | head -n 5 | cut -d : -f 1
root
bin
daemon
adm
lp
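The -f list accepts the same comma and range syntax as -b. The sketch below runs on two hypothetical passwd-style records instead of the real /etc/passwd, so the result does not depend on the machine.

```shell
#!/bin/sh
# Hypothetical passwd-style records (same colon-separated layout as /etc/passwd).
records='root:x:0:0:root:/root:/bin/bash
daemon:x:2:2:daemon:/sbin:/sbin/nologin'

# Fields 1 and 7: the user name and the login shell.
# Selected fields are joined with the input delimiter by default.
shells=$(printf '%s\n' "$records" | cut -d : -f 1,7)
printf '%s\n' "$shells"

# A range works too: fields 3-4 are the numeric UID and GID.
ids=$(printf '%s\n' "$records" | cut -d : -f 3-4)
printf '%s\n' "$ids"
```

The selected fields stay separated by a colon in the output, because cut reuses the input delimiter unless told otherwise.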

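The delimiter can be changed with -d to a comma, a space, or any other single character. Note that cut treats every occurrence of the delimiter as a field boundary, so consecutive delimiters produce empty fields rather than being collapsed into one. This is a limitation of the command when parsing input that is aligned with runs of spaces.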
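To illustrate the consecutive-delimiter limitation, the sketch below uses a made-up line with two spaces between the first two words. Squeezing repeated spaces with tr -s before calling cut is one common workaround, not the only one.

```shell
#!/bin/sh
# Made-up input with two spaces between the first and second word.
line='alpha  beta gamma'

# Every space starts a new field, so field 2 is the empty string
# between the two spaces and "beta" is actually field 3.
raw=$(printf '%s\n' "$line" | cut -d ' ' -f 2)
printf 'field 2 as-is: [%s]\n' "$raw"

# One common workaround: squeeze runs of spaces with tr -s first.
squeezed=$(printf '%s\n' "$line" | tr -s ' ' | cut -d ' ' -f 2)
printf 'field 2 after tr -s: [%s]\n' "$squeezed"
```

The first command prints an empty pair of brackets; after squeezing, field 2 is beta.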
