Re: Copying the text in subject of multiple messages?

From Dave Royal <dave@dave123royal.com>
Newsgroups alt.comp.software.thunderbird
Subject Re: Copying the text in subject of multiple messages?
Date 2026-02-16 15:06 +0000
Organization A noiseless patient Spider
Message-ID <10mvbpg$uhho$1@dont-email.me>
References (5 earlier) <10mqdme$3cavb$1@dont-email.me> <10mqfqe$3cuuc$2@dont-email.me> <10mu8t0$jrn5$2@toylet.eternal-september.org> <10muk3q$n4fm$1@dont-email.me> <10muqb4$p5cb$1@dont-email.me>


On Mon, 16 Feb 2026 05:08:33 -0500, Paul wrote:

> On Mon, 2/16/2026 3:22 AM, Dave Royal wrote:
>> 
>> My first thought was to use sed to convert the mail file into csv
>>  - date, sender, subject - and then sort and select in a spreadsheet.
>>  The rfc2047 encoding complicates it.
>> 
>> I wonder if awk could do the conversion - between sed and the
>>  spreadsheet? People have programmed some remarkable things in awk.
>> 
>> Any script or language with an rfc2047 decoder could be used to
>>  write a little utility:
>> stdin > decode > stdout
>> 
>> I never learned C but I recently played with Rust:
>> https://crates.io/crates/rfc2047-decoder/1.1.0
> 
> The mbox file is a mix of character sets.
> This is not particularly amenable to hobby programming (I tried and
> failed). You really need a parser that reads the headers on each
> message, if you expect to access a body in an intelligent way.
> 
> The header lines at one time, would have been guaranteed to be ASCII.
> But gradually certain of the lines now also require flexible parsing to
> render the UTF-8 or whatever, properly. The Subject line could have
> emojis in it.
> And again, not all the Linux tools could handle that. Just as PERL may
> not be prepared to handle UTF-8 properly, or GAWK or SED for that
> matter.
> 
> If the Subject line has an escape sequence in it,
> your tools have to handle that.
> 
> Years ago, this would have been a doddle. Today,
> it takes a lot of thinking to do this right and make a result that is
> presentable.
> 
> A Subject: line during the ThaiSpam attack.
> 
>    Subject: =?UTF-8?B?4Lid4Liy4LiBIDEwIOC4o+C4seC4miAxMDAg4LiX4Liz4Lii4Lit4LiUIDIwMCDguJY=?=
> 
> How it looks in Thunderbird.
> 
>    [Picture]
> 
>     https://i.postimg.cc/HxzKH3qV/Spam-Attack-Subject-Lines.gif
> 
> Normal tools aren't really ready for that.
> 
> While some days, the Subject lines look "normal", your crafted software
> solution has to work with the worst case behavior.

The header lines are still all ASCII. Non-ASCII characters and emoji
are encoded (RFC 2047) like the example you quoted. sed or grep can easily
extract just the (still encoded) subject lines. Extracting date, from and
subject onto one CSV line can be done with sed; you have to use the hold
space to merge the three lines.
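
For the decoding step that sed cannot do, a minimal Python sketch is below. It is not the sed script described above; it assumes one message is available as text and uses the standard library's email package for both the header lookup and the RFC 2047 decoding. The function names decode_rfc2047 and message_to_csv_row are made up for illustration.

```python
import csv
import io
from email import message_from_string
from email.header import decode_header, make_header


def decode_rfc2047(raw: str) -> str:
    """Decode =?charset?B?...?= / =?charset?Q?...?= words in an ASCII header value."""
    return str(make_header(decode_header(raw)))


def message_to_csv_row(message_text: str) -> str:
    """Return one CSV line (date, sender, decoded subject) for a single message."""
    msg = message_from_string(message_text)
    fields = [
        decode_rfc2047(msg.get(name, ""))  # a missing header becomes an empty field
        for name in ("Date", "From", "Subject")
    ]
    out = io.StringIO()
    # csv.writer quotes fields that contain commas, e.g. the Date value.
    csv.writer(out, lineterminator="\n").writerow(fields)
    return out.getvalue()
```

Looping this over an mbox file (for instance with the standard mailbox module) would give the date, sender, subject CSV to sort in a spreadsheet. A header naming a charset Python does not know would raise an error, so a real utility would need a fallback.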

I do something similar with the bodies of emails from a supermarket
(Sainsbury's, UK) confirming deliveries. I extract the item, quantity and
price from a table in the email and turn it into a spreadsheet for ease of
checking. The sequence is
email (eml) > sed (csv) > awk* (csv) > sort (csv) > LibreOffice Calc
*awk adds a sort category column, e.g. dairy.
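
The awk and sort stages could be sketched in Python roughly as follows. This is an illustration, not my actual awk script: the keyword table, the column order (item, quantity, price) and the function names are all assumptions.

```python
import csv
import io

# Hypothetical keyword-to-category table; the real mapping is not shown here.
CATEGORIES = {"milk": "dairy", "cheese": "dairy", "banana": "fruit"}


def categorise(item: str) -> str:
    """Return the category of the first keyword found in the item name."""
    lowered = item.lower()
    for keyword, category in CATEGORIES.items():
        if keyword in lowered:
            return category
    return "other"


def add_category_and_sort(csv_text: str) -> str:
    """Prepend a category column to item,quantity,price rows, then sort."""
    rows = [row for row in csv.reader(io.StringIO(csv_text)) if row]
    tagged = [[categorise(row[0])] + row for row in rows]
    # Sort by category first, then by item name within each category.
    tagged.sort(key=lambda row: (row[0], row[1].lower()))
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(tagged)
    return out.getvalue()
```

Putting the category in the first column means a plain text sort of the output groups the items the same way.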

The email bodies are HTML-only UTF-8 (so no decoding required). LO Calc
displays the UTF-8 text fine, but when I moved it to Windows and Excel it
didn't, so I had to put it through an extra stage (another sed) to add a
byte order mark (BOM).
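
The BOM stage amounts to putting the three bytes EF BB BF in front of the file. A small Python sketch of the same idea (not the sed command I use; add_utf8_bom is an illustrative name):

```python
import codecs


def add_utf8_bom(data: bytes) -> bytes:
    """Prefix UTF-8 encoded CSV data with a BOM unless one is already there."""
    if data.startswith(codecs.BOM_UTF8):
        return data  # already marked, do not add a second BOM
    return codecs.BOM_UTF8 + data
```

When the CSV is written from Python in the first place, opening the output file with encoding="utf-8-sig" has the same effect.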

(I then automated it with a TB addon: display the email and it 
automatically generates a csv file.)
-- 
(Remove any numerics from my email address.)
