Non-ascii SUBJECT and FROM turns into " =?utf-8?b?" even when charset is set to iso-8859-2 #126

Open
zoltan-fedor opened this issue Mar 18, 2016 · 3 comments

Comments

@zoltan-fedor

Hi,
So this is an interesting one.
I am trying to send an email with a non-ASCII (Hungarian) subject line. At the first non-ASCII character, the string " =?utf-8?" is inserted. That in itself is not a problem, but when an email client (both Gmail and Yahoo Mail) displays the message, the additional space remains. The result is that spaces appear at seemingly random places in the subject line.

Setting the charset explicitly to iso-8859-2 doesn't help; the UTF-8 encoded version seems to always take precedence.

Sending the email like this:

    app = current_app._get_current_object()
    msg = Message(subject='Sikeres regisztracio es belepo a SF Kiallitasra (belépő)',
                              sender=('SF Biztonságtechnikai Kiállítás & Konferencia', '[email protected]'),
                              recipients=[('Fedoráéő Zoltan' , '[email protected]')],
                              charset='iso-8859-2', bcc=[bcc])
    msg.body = render_template((template + '.txt'), **kwargs).encode('iso-8859-2')
    msg.html = render_template((template + '.html'), **kwargs).encode('iso-8859-2')
    print(msg.charset)
    thr = Thread(target=send_async_email, args=[app, msg])
    thr.start()

This is what flask-mail generates:
Content-Type: multipart/mixed; boundary="===============2657123796784931499=="\r\nMIME-Version: 1.0\r\nSubject: Sikeres regisztracio es belepo az SF Kiallitasra (b\r\n =?utf-8?b?ZWzDqXDFkSk=?=\r\nFrom: SF =?utf-8?q?Biztons=C3=A1gtechnikai?=\r\n =?utf-8?b?S2nDoWxsw610w6Fz?= & Konferen cia <[email protected]>\r\nTo: =?iso-8859-2?q?Fedor=E1=E9=F5_Zoltan?= <[email protected]>\r\nDate: Fri, 18 Mar 2016 22:50:40 +0100\r\nMessage-ID: <[email protected]>\r\n\r\n--===============2657123796784931499==\r\nContent-Type: multipart/alternative;\r\n boundary="===============5571071455436956408=="\r\nMIME-Version: 1.0\r\n\r\n--===============5571071455436956408==\r\nContent-Type: text/plain; charset="iso-8859-2"\r\nMIME-Version: 1.0\r\nContent-Transfer-Encoding: quoted-printable\r\n\r\nTisztelt Fedor=E1=E9=F5 Zoltan etc

As you can see, the funny thing is that the To field is correctly encoded with iso-8859-2, but the Subject and From fields are still encoded with utf-8. Those encodings are fine (if the client knows utf-8), but an additional " =?utf-8?" is inserted, so after the client decodes the header, a space is either added or lost.

So in GMail and Yahoo Mail this will show like:

From: SF BiztonságtechnikaiKiállítás & Konferen cia ([email protected])
To: Fedoráéő Zoltan ([email protected])
Subject: Sikeres regisztracio es belepo a SF Kiallitasra (b elépő)

  • Note the additional space in the subject, "(b elépő)", and the added and lost spaces in the From field.

If you look at the header above, you can see why To displays correctly while From and Subject don't: "\r\n =?utf-8?b?" is inserted mid-text, and that additional space is sometimes displayed and sometimes dropped.
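The added and lost spaces can be reproduced with Python's standard library alone. The sketch below feeds the Subject value and the From display name from the generated message above into `email.header.decode_header`. It relies on the RFC 2047 rule that whitespace between two adjacent encoded-words is ignored, while whitespace between plain text and an encoded-word is kept. The helper name `decode` is only illustrative.

```python
from email.header import decode_header, make_header

# Header values copied from the generated message above (folded with CRLF + space).
raw_subject = (
    "Sikeres regisztracio es belepo az SF Kiallitasra (b\r\n"
    " =?utf-8?b?ZWzDqXDFkSk=?="
)
raw_from_name = (
    "SF =?utf-8?q?Biztons=C3=A1gtechnikai?=\r\n"
    " =?utf-8?b?S2nDoWxsw610w6Fz?= & Konferen cia"
)


def decode(value):
    """Decode RFC 2047 encoded-words into a plain string (illustrative helper)."""
    return str(make_header(decode_header(value)))


decoded_subject = decode(raw_subject)
decoded_from_name = decode(raw_from_name)

# Subject: plain text "(b" is followed by an encoded-word, so the fold
# whitespace between them survives as a visible space.
print(decoded_subject)

# From: two encoded-words are adjacent, so the whitespace between them
# is dropped and the two words run together.
print(decoded_from_name)
```

If this reading is right, the mail clients decode the header as the standard describes, and the problem lies in where the generator splits the text into plain and encoded parts.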

Interestingly, the sanitize_subject() function in flask_mail.py correctly returns the iso-8859-2 encoded version of the subject, but what makes it into the email header is the UTF-8 encoded MIME version.

Any idea how to actually use Flask-Mail to send international characters in the FROM and SUBJECT field without additional / lost spaces?
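For comparison, here is a minimal sketch of how the standard library's `email.header.Header` picks a charset. It does not show Flask-Mail's code path, and whether Flask-Mail goes through this class for Subject and From is an assumption not verified here. It only shows that `Header` falls back to UTF-8 when no charset is passed and the text is not ASCII, and honors iso-8859-2 when it is passed explicitly.

```python
from email.header import Header

word = "belépő"

# No charset given: the default is us-ascii, which cannot represent the
# text, so Header falls back to UTF-8.
default_encoded = Header(word).encode()

# Charset given explicitly: the encoded-word is labelled iso-8859-2.
latin2_encoded = Header(word, charset="iso-8859-2").encode()

print(default_encoded)
print(latin2_encoded)
```

This suggests one thing to check: whether the charset passed to `Message` actually reaches the point where the Subject and From headers are encoded.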

@zoltan-fedor
Author

It seems the additional space in the subject appears because the subject line reaches the 78-character limit, at which point a line break followed by a space is added. That seems to be part of the standard (although it is surprising that both Gmail and Yahoo Mail would not follow the standard, but whatever).
It is still unclear why spaces are removed from the From field and why utf-8 is used when charset=iso-8859-2 is specified explicitly.
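The line-length folding itself does not have to introduce spaces. As a sketch using only the standard library (not Flask-Mail), the example below encodes the whole subject as a single `Header` chunk. The value is then folded into several encoded-words, one per line, and because every fold falls between two encoded-words, decoding drops the fold whitespace and returns the original text.

```python
from email.header import Header, decode_header, make_header

subject = "Sikeres regisztracio es belepo a SF Kiallitasra (belépő)"

# Encode the whole value as one chunk. header_name lets the first line
# leave room for the "Subject: " prefix when folding.
header = Header(subject, charset="utf-8", header_name="Subject")
folded = header.encode()  # lines are joined with "\n" plus a leading space
lines = folded.split("\n")

for line in lines:
    print(len(line), repr(line))

# Unfold and decode: whitespace between adjacent encoded-words is ignored.
round_trip = str(make_header(decode_header(folded)))
print(round_trip == subject)
```

The trade-off is that the ASCII part of the subject is no longer human-readable in the raw header, since the whole value is encoded.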

@zoltan-fedor zoltan-fedor changed the title Non-ascii subject turns into "\n\r =?utf-8?b?" even when encoding is set Non-ascii SUBJECT and FROM turns into "\n\r =?utf-8?b?" even when charset is set to iso-8859-2 Mar 18, 2016
@zoltan-fedor zoltan-fedor changed the title Non-ascii SUBJECT and FROM turns into "\n\r =?utf-8?b?" even when charset is set to iso-8859-2 Non-ascii SUBJECT and FROM turns into " =?utf-8?b?" even when charset is set to iso-8859-2 Mar 18, 2016
@zoltan-fedor
Author

P.S. I am using Python 3.5.1.

@mybudderz

no results for From: Snapshot-Content-Location: https://github.com/ThomasOPMarks/Blind-Person-Running/blob/c73130ab92281eea3a9c5a9111cb0bac6dbb1fa0/Blind%20Person%20Running%20App.xcodeproj/xcuserdata/student.xcuserdatad/xcdebugger/Breakpoints_v2.xcbkptlist Subject: =?utf-8?Q?Blind-Person-Running/Breakpoints_v2.xcbkptlist=20at=20c73130ab9?= =?utf-8?Q?2281eea3a9c5a9111cb0bac6dbb1fa0=20=C2=B7=20ThomasOPMarks/Blind-?= =?utf-8?Q?Person-Running?= Date: Fri, 5 Jul 2018 12:35:49 -0000 MIME-Version: 1.0 Content-Type: multipart/related; type="text/html"; boundary="----MultipartBoundary--4usT75YIKXQRN4UA5UYAM0zUtLP1H7Ed0YYb1Wx5PN----" ------MultipartBoundary--4usT75YIKXQRN4UA5UYAM0zUtLP1H7Ed0YYb1Wx5PN---- Content-Type: text/html Content-ID: [email protected] Content-Transfer-Encoding: binary Content-Location: https://github.com/ThomasOPMarks/Blind-Person-Running/blob/c73130ab92281eea3a9c5a9111cb0bac6dbb1fa0/Blind%
