Bug 24067 - python-lxml new security issue CVE-2018-19787
Summary: python-lxml new security issue CVE-2018-19787
Status: RESOLVED FIXED
Alias: None
Product: Mageia
Classification: Unclassified
Component: Security
Version: 6
Hardware: All (OS: Linux)
Priority: Normal (Severity: major)
Target Milestone: ---
Assignee: QA Team
QA Contact: Sec team
URL:
Whiteboard: MGA6-32-OK MGA6-64-OK
Keywords: advisory, validated_update
Depends on:
Blocks:
 
Reported: 2018-12-25 21:52 CET by David Walser
Modified: 2018-12-31 23:43 CET

See Also:
Source RPM: python-lxml-3.8.0-1.1.mga6.src.rpm
CVE:
Status comment:


Description David Walser 2018-12-25 21:52:42 CET
Fedora has issued an advisory on December 21:
https://lists.fedoraproject.org/archives/list/package-announce@lists.fedoraproject.org/thread/3RVMDZTRGFNPQRD6MD74QL2A5IOBPFXQ/

The issue is fixed upstream in 4.2.5.
Comment 1 David Walser 2018-12-26 02:10:10 CET
Ubuntu has issued an advisory for this on December 10:
https://usn.ubuntu.com/3841-1/
Comment 2 David GEIGER 2018-12-26 05:48:56 CET
Fixed for mga6!

CC: (none) => geiger.david68210

Comment 3 Marja Van Waes 2018-12-26 08:01:00 CET
(In reply to David GEIGER from comment #2)
> Fixed for mga6!

Thanks David :-)

Does someone have time to write an advisory?

Assigning to the Python stack maintainers, CC'ing the registered maintainer.

Assignee: bugsquad => python
CC: (none) => makowski.mageia, marja11

Comment 4 David Walser 2018-12-26 15:57:18 CET
Advisory:
========================

Updated python-lxml packages fix security vulnerability:

An issue was discovered in lxml before 4.2.5. lxml/html/clean.py in the
lxml.html.clean module does not remove javascript: URLs that use escaping,
allowing a remote attacker to conduct XSS attacks, as demonstrated by
"j a v a s c r i p t:" in Internet Explorer (CVE-2018-19787).

References:
https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2018-19787
https://lists.fedoraproject.org/archives/list/package-announce@lists.fedoraproject.org/thread/3RVMDZTRGFNPQRD6MD74QL2A5IOBPFXQ/
========================
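
To make the class of bug concrete, here is a minimal sketch of the kind of check the advisory describes: undo escaping, strip whitespace and control characters, and only then test the URL scheme. It is not lxml's implementation. The helper name looks_like_script_url and both regular expressions are made up for illustration, and treating percent-encoding as one of the "escaping" forms is an assumption not confirmed in this report.

```python
import re
from urllib.parse import unquote_plus

# Hypothetical helpers for illustration only; NOT taken from lxml.
_STRIP = re.compile(r'[\s\x00-\x1f]+')                        # whitespace and control characters
_SCRIPT_SCHEME = re.compile(r'^(?:javascript|vbscript):', re.I)


def looks_like_script_url(url):
    """Return True if url resolves to a script scheme once escaping is undone."""
    decoded = unquote_plus(url)         # undo %XX and '+' escaping (assumed escape form)
    compact = _STRIP.sub('', decoded)   # drop spaces, newlines and control characters
    return bool(_SCRIPT_SCHEME.match(compact))
```

A cleaner that tests the scheme on the raw attribute value, without the two normalisation steps, would let spaced-out or escaped variants through.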

Updated packages in core/updates_testing:
========================
python2-lxml-4.2.5-1.mga6
python3-lxml-4.2.5-1.mga6
python-lxml-docs-4.2.5-1.mga6

from python-lxml-4.2.5-1.mga6.src.rpm

Assignee: python => qa-bugs

Comment 5 Lewis Smith 2018-12-28 11:19:47 CET
Pointers:
- From CVE refs, the only useful one is
  https://lists.debian.org/debian-lts-announce/2018/12/msg00001.html
  "LXML did not remove "javascript:" URLs that used escaping such as
  "j a v a s c r i p t". This is a similar issue to CVE-2014-3146."
- Of the bug references, only #13326 is relevant (it covers the older CVE mentioned above).
- https://bugs.mageia.org/show_bug.cgi?id=13326#c9
Claire again! She gives an example for the old CVE we should be able to adapt here.

No time to continue just now.

CC: (none) => lewyssmith

Comment 6 Herman Viaene 2018-12-29 12:11:21 CET
MGA6-32 MATE on IBM Thinkpad R50e
No installation issues
Followed lead of older bug as mentioned above, copying literally:
$ python
Python 2.7.15 (default, May  1 2018, 17:07:49) 
[GCC 5.4.0] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from lxml.html.clean import clean_html
>>> 
>>> html = '''\
... <html>
... <body>
... <a href="javascript:alert(0)">
... aaa</a>
... <a href="javas\x01cript:alert(1)">bbb</a>
... <a href="javas\x02cript:alert(1)">bbb</a>
... <a href="javas\x03cript:alert(1)">bbb</a>
... <a href="javas\x04cript:alert(1)">bbb</a>
... <a href="javas\x05cript:alert(1)">bbb</a>
... <a href="javas\x06cript:alert(1)">bbb</a>
... <a href="javas\x07cript:alert(1)">bbb</a>
... <a href="javas\x08cript:alert(1)">bbb</a>
... <a href="javas\x09cript:alert(1)">bbb</a>
... </body>
... </html>'''
>>> 
>>> print clean_html(html)
<div>
<body>
<a href="">
aaa</a>
<a href="">bbb</a>
<a href="">bbb</a>
<a href="">bbb</a>
<a href="">bbb</a>
<a href="">bbb</a>
<a href="">bbb</a>
<a href="">bbb</a>
<a href="">bbb</a>
<a href="">bbb</a>
</body>
</div>
>>> quit()
So at least this works as before.
Tried the same for python3, following Claire's note on the print command (not copying all the previous html commands here, they are exactly the same), but at the end:
>>> 'print (clean_html(html))'
'print (clean_html(html))'
>>> 
and nothing is displayed.
Waiting for Lewis' comments.

CC: (none) => herman.viaene

Comment 7 Lewis Smith 2018-12-30 21:53:04 CET
Testing M6 x64

Somehow a pkg name mutates from 'python-lxml' to 'python2-lxml'.

BEFORE update: python-lxml-3.8.0-1.1.mga6, python3-lxml-3.8.0-1.1.mga6
? $ rpm -q python2-lxml
package python2-lxml is not installed

Trying Claire's script slightly tweaked re c5 example:-
 $ python
Python 2.7.15 (default, May  1 2018, 17:08:05) 
[GCC 5.4.0] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from lxml.html.clean import clean_html
>>> html = '''\
... <html>
... <body>
... <a href="javascript:alert(0)">
... aaa</a>
... <a href="j a v a s c r i p t:alert(1)">bbb</a>
... <a href="j a v a s c r i p t:alert(1)">ccc</a>
... <a href="j a v a s c r i p t:alert(1)">ddd</a>
... </body>
... </html>'''
>>> print clean_html(html)
<div>
<body>
<a href="">
aaa</a>
<a href="">bbb</a>
<a href="">ccc</a>
<a href="">ddd</a>
</body>
</div>
 Alas - it gets properly cleaned. Try again:
>>> html = '''\
... <html>
... <body>
... <a href="javascript:alert(0)">
... aaa</a>
... <a href="j a v a \01s c r i p t:alert(1)">bbb</a>
... <a href="j a v a \02s c r i p t:alert(1)">ccc</a>
... <a href="j a v a \03s c r i p t:alert(1)">ddd</a>
... </body>
... </html>'''
>>> print clean_html(html)
<div>
<body>
<a href="">
aaa</a>
<a href="">bbb</a>
<a href="">ccc</a>
<a href="">ddd</a>
</body>
</div>
 All correct.
 $ python3
Python 3.5.3 (default, May 23 2018, 14:20:56) 
[GCC 5.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from lxml.html.clean import clean_html
>>> html = '''\
... <html>
... <body>
... <a href="javascript:alert(0)">aaa</a>
... <a href="j a v a \01s c r i p t:alert(1)">bbb</a>
... <a href="j a v a \02s c r i p t:alert(1)">ccc</a>
... <a href="j a v a \03s c r i p t:alert(1)">ddd</a>
... </body>
... </html>'''
>>> print (clean_html(html))
<div>
<body>
<a href="">aaa</a>
<a href="">bbb</a>
<a href="">ccc</a>
<a href="">ddd</a>
</body>
</div>
 which again is correct. So I cannot reproduce the example fault.
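
A side note on the escapes used in the transcripts: Herman's script writes the control characters as \x01 to \x09, while the retry above uses \01 to \03. In a Python string literal both forms are escapes for the same character, so the two tests feed the cleaner the same kind of input. A minimal check, with variable names made up for illustration:

```python
# Both spellings denote the same single control character, U+0001.
octal_form = "j a v a \01s c r i p t:alert(1)"  # octal escape, as in the x64 retry
hex_form = "j a v a \x01s c r i p t:alert(1)"   # hex escape, the style of the 32-bit test

same_character = octal_form == hex_form
control_char_code = ord(octal_form[8])          # the character right after "j a v a "
```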
----------------------------------------------------------------
AFTER update: The update list now shows python2-lxml.
- python2-lxml-4.2.5-1.mga6.x86_64
- python3-lxml-4.2.5-1.mga6.x86_64
? $ rpm -q python-lxml
package python-lxml is not installed

Be that as it may, results were identical (and correct) to those before the update, for both Python 2 and Python 3.
-----------------
@ Herman
> but at the end:
> >>> 'print (clean_html(html))'
> 'print (clean_html(html))'
> >>> 
> and nothing is displayed.
Doubtless the bounding quotes altered things: with them the interpreter just echoes the string instead of running the print.
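
A small sketch of the difference, runnable without lxml. The clean_html defined here is a hypothetical stand-in for lxml.html.clean.clean_html, used only so the example is self-contained.

```python
import contextlib
import io


def clean_html(html):
    # Hypothetical stand-in for lxml.html.clean.clean_html; it only blanks one known URL.
    return html.replace('javascript:alert(0)', '')


html = '<a href="javascript:alert(0)">aaa</a>'

# Quoted: this is just a str object. The interactive prompt echoes its repr
# and clean_html is never called.
quoted = 'print (clean_html(html))'

# Unquoted: the call runs and the cleaned markup goes to stdout (captured here).
buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    print(clean_html(html))
printed = buffer.getvalue()
```

So the python3 result in comment 6 says nothing about the package either way; the statement simply was not executed.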

We both agree that the updated pkgs behave as they should, even if they did so already. So OKs all round, validation, advisory from c4.

Keywords: (none) => advisory, validated_update
Whiteboard: (none) => MGA6-32-OK MGA6-64-OK
CC: (none) => sysadmin-bugs

Comment 8 Mageia Robot 2018-12-31 23:43:15 CET
An update for this issue has been pushed to the Mageia Updates repository.

https://advisories.mageia.org/MGASA-2018-0497.html

Status: NEW => RESOLVED
Resolution: (none) => FIXED

