what's the fastest way to sort/filter/sum increasing digits of numbers in a list
An answer to this question on Stack Overflow.
Question
example:
L = [12,14,22,41,21,23]
and I want the result to be:
R == [12,14,22,23]
The digits of the number must be in an increasing order, following are my solutions and they both work, but they both are too slow.
What's the fastest way of sorting it?
solution one:
R = filter(lambda j: int(''.join(sorted(str(j))))==j , L)
solution two:
for j in L:
if int(''.join(sorted(str(j))))==j:
R.append(j)
Question 2 - additionally, I want the sum of adding these corresponding digits equal to 5.
Here are my solutions, again, they do work but are too slow.
So what is the fastest way of doing it.
newR_should_be == [14,23]
one:
newR = filter(lambda i: sum([int(x) for x in str(i)])==5 ,R)
two:
for i in R:
if sum([int(x) for x in str(i)])==5:
newR.append(i)
Any help would be appreciated.
Answer
This is one reason I don't like Python: the simple solution you propose, which would work well enough in most languages, is probably as slow as molasses in Python. Too many high-level operations.
But we can work around this by using Numpy as follows:
#!/usr/bin/env python3
import numpy as np
COUNT = 1000
MAXNUM = 100000
MAXDIGITS = 6
prevremain = 99*np.ones(COUNT) #The previous digit we removed from the RHS
origdata = np.random.randint(0,MAXNUM,size=COUNT) #Original data
quot = origdata.copy() #What's left when we remove a digit from the RHS
good = np.ones(COUNT) #Whether the number's digits are monotonically decreasing from right to left
for i in range(1,MAXDIGITS): #Pull digits off of the numbers one at a time
quot, remain = np.divmod(quot,10) #quot is the rest of the number, remain is the right-most digit
#Check to see if this digit was smaller, or equal to, than the last one we
#removed. NOTE: If you have numbers with an unequal number of digits, you'll
#need to think carefully about whether `<=` might work better for you.
good = np.logical_and(good, (remain < prevremain))
prevremain = remain
#These are the numbers you want
print(origdata[good])
Numpy offloads more of the computation to low-level libraries which work quickly. In this case, it can use vectorized math operations to perform digit checks across your entire input quickly. You can then use these as a filter.