Ask Anything Monday - Weekly Thread

Hi, I am working with some SMS data that is not formatted very well. Essentially, the data looks like this in a .txt file:

<sms protocol="0" address="Verizon Wireless" date="1543372916305" type="1" subject="null" body="FREE VZW MSG: To complete this activation, restart your device now." toa="null" sc_toa="null" service_center="null" read="1" status="-1" locked="0" date_sent="1543372916000" readable_date="Nov 27, 2018 8:41:56 PM" contact_name="(Unknown)" />

<sms protocol="0" address="Verizon Wireless" date="1543372916305" type="1" subject="null" body="FREE VZW MSG: To complete this activation, restart your device now." toa="null" sc_toa="null" service_center="null" read="1" status="-1" locked="0" date_sent="1543372916000" readable_date="Nov 27, 2018 8:41:56 PM" contact_name="(Unknown)" />

I hope I formatted this correctly. I want to go through each of these < > entries and grab 3 fields: (1) the body; (2) the readable_date; and (3) the contact_name. Do I need to approach this with RegEx, or is there a better way you would recommend?
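Since each entry above looks like a self-closing XML element, an XML parser may be a safer fit than RegEx here. Below is a minimal sketch using the standard library's xml.etree.ElementTree. It assumes every <sms ... /> entry sits on a single line, as in the sample; the function name extract_sms_fields and the sample list are made up for illustration.

```python
import xml.etree.ElementTree as ET


def extract_sms_fields(lines):
    """Yield (body, readable_date, contact_name) for each <sms ... /> line.

    Assumption: every entry occupies exactly one line, as in the sample.
    """
    for line in lines:
        line = line.strip()
        if not line.startswith("<sms "):
            continue  # skip blank lines and anything that is not an sms entry
        elem = ET.fromstring(line)  # parse the single self-closing element
        yield (
            elem.get("body"),
            elem.get("readable_date"),
            elem.get("contact_name"),
        )


# One entry copied from the sample data, preceded by a blank line.
sample = [
    "",
    '<sms protocol="0" address="Verizon Wireless" date="1543372916305" '
    'type="1" subject="null" body="FREE VZW MSG: To complete this '
    'activation, restart your device now." toa="null" sc_toa="null" '
    'service_center="null" read="1" status="-1" locked="0" '
    'date_sent="1543372916000" readable_date="Nov 27, 2018 8:41:56 PM" '
    'contact_name="(Unknown)" />',
]

rows = list(extract_sms_fields(sample))
for body, readable_date, contact_name in rows:
    print(readable_date, contact_name, body, sep=" | ")
```

To run this on the real file, the list can be swapped for an open file object, for example `with open("sms.txt", encoding="utf-8") as f:` followed by `extract_sms_fields(f)`, where sms.txt is a placeholder name. If the file turns out to have a single root element wrapping all the entries, parsing the whole file with ET.parse and looping over root.iter("sms") would likely be simpler.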

/r/learnpython Thread