Eric Landry
Hi,
Probably an easy one for an experienced PQ pro.
I'm downloading a CSV file like the one below, but with several thousand rows, and need to eliminate all but the most recent record for each ticker as indicated by the datekey column.
Some tickers have more than one row (not always, but sometimes), and I only want to keep the most recent. So, for example, there are two BBY rows (datekey 10/31/2015 and 1/30/2016), and I only want the latest one (1/30/2016).
Anybody have an idea for how to go about cleaning this data?
Thanks
ticker | dimension | datekey | assetsc | liabilities | shareswa | shareswadil | debt | cashneq |
AAPL | MRQ | 12/26/2015 | 76219000000 | 1.65017E+11 | 5558930000 | 5594127000 | 62963000000 | 16689000000 |
AMZN | MRQ | 12/31/2015 | 35705000000 | 51363000000 | 467000000 | 480000000 | 8227000000 | 15890000000 |
BBY | MRQ | 10/31/2015 | 11735000000 | 10525000000 | 344700000 | 349000000 | 1639000000 | 1697000000 |
BBY | MRQ | 1/30/2016 | 9886000000 | 9141000000 | 339300000 | 342000000 | 1734000000 | 1976000000 |
FB | MRQ | 12/31/2015 | 21652000000 | 5189000000 | 2824000000 | 2868000000 | 114000000 | 4907000000 |
IBM | MRQ | 12/31/2015 | 42504000000 | 96071000000 | 969578092 | 972801068 | 39889000000 | 7686000000 |
KSS | MRQ | 10/31/2015 | 6203000000 | 9424000000 | 194497789 | 5135000000 | 501000000 | |
KSS | MRQ | 1/30/2016 | 5076000000 | 8115000000 | 189820241 | 4708000000 | 707000000 | |
MSFT | MRQ | 12/31/2015 | 1.27812E+11 | 1.03318E+11 | 7964000000 | 8028000000 | 44429000000 | 1.0264E+11 |
MSFT | MRQ | 3/31/2016 | 1.28421E+11 | 1.07063E+11 | 7895000000 | 7985000000 | 46394000000 | 1.05552E+11 |
IBM | MRQ | 3/31/2016 | 47623000000 | 1.03784E+11 | 961700000 | 964400000 | 45557000000 | 14354000000 |
AAPL | MRQ | 3/26/2016 | 87592000000 | 1.7482E+11 | 5514381000 | 5540886000 | 79872000000 | 21514000000 |
FB | MRQ | 3/31/2016 | 23812000000 | 4925000000 | 2843000000 | 2888000000 | 0 | 6456000000 |
AMZN | MRQ | 3/31/2016 | 30513000000 | 46372000000 | 471000000 | 481000000 | 8219000000 | 12470000000 |
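To pin down the result being asked for, here is a minimal sketch of the logic in plain Python rather than Power Query. It is only an illustration: the inline `SAMPLE` text is a hypothetical cut-down copy of a few rows and columns from the table, the function names are made up for this example, and it assumes `datekey` is always a US-style month/day/year date.

```python
import csv
import io
from datetime import datetime

# Hypothetical sample: a few rows and columns from the table, as CSV text.
SAMPLE = """ticker,dimension,datekey,assetsc
AAPL,MRQ,12/26/2015,76219000000
BBY,MRQ,10/31/2015,11735000000
BBY,MRQ,1/30/2016,9886000000
MSFT,MRQ,12/31/2015,1.27812E+11
MSFT,MRQ,3/31/2016,1.28421E+11
AAPL,MRQ,3/26/2016,87592000000
"""


def parse_datekey(text):
    # Assumption: dates are month/day/year, as in the sample rows.
    return datetime.strptime(text, "%m/%d/%Y").date()


def latest_per_ticker(rows):
    """Return one row per ticker: the row with the greatest datekey."""
    latest = {}
    for row in rows:
        current = latest.get(row["ticker"])
        # Keep this row if the ticker is new or its date is more recent.
        if current is None or parse_datekey(row["datekey"]) > parse_datekey(current["datekey"]):
            latest[row["ticker"]] = row
    return list(latest.values())


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
for row in latest_per_ticker(rows):
    print(row["ticker"], row["datekey"])
```

The important detail is that the dates are compared as dates, not as text, so 1/30/2016 correctly beats 10/31/2015. In Power Query the same idea would presumably be to set `datekey` to a date type, group by `ticker`, and keep the row with the maximum `datekey` in each group.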