Fun programming: this post is based on a certain Project named for a famous mathematician (Euler), but the exact problem is different than the programming question posted there because it is considered bad form to spoil the solutions. This text should be unattractive to Google searches and people looking to spoil the fun of figuring out a hard problem without help.
I'll discuss a related problem which has different mathematical properties but the similar programming challenges. The original problem is better known as Happy Numbers. The modification for today's discussion is a variation called Happy Cube Numbers which has the following new properties: A positive integer i is represented in base-10 and its digits are individually summed and cubed to produce another integer j. The following definition captures this function:
let
\[
i=\sum_{k=1}^{\left\lceil \log_{10}i\right\rceil }10^{k-1}d_{k}
\]
and define
\[
f\left(i\right)=\sum_{k}d_{k}^{3}
\]
If f(i)=i, then i is a Happy Cube Number.
For example,
\[
i=371=10^{2}3+10^{1}7+10^{0}1
\]
Which is a Happy Cube Number because:
\[
f(371)=3^{3}+7^{3}+1^{3}=371
\]
Given this definition, we can already write some code and find some Happy Cube Numbers:
def sumCubedDigits(N):
mysum=0L
while N > 0:
mysum += (N%10)**3
N //= 10
return mysum
This python code is very fast, O(log10(i)), and we can use it to exhaustively search for 5-digit Happy Cube Numbers:
print [i for i in range(100000L) if sumCubedDigits(i)==i]
Output:
[0, 1, 153, 370, 371, 407]
From these results, we might conjecture that there are no Happy Cube Numbers above 407. Indeed, this is the case because f(i) < i for most i. Consider all 5-digit integers:
\[
i=[d_{5}d_{4}d_{3}d_{2}d_{1}]_{10}=10^{4}d_{5}+10^{3}d_{4}+10^{2}d_{3}+10^{1}d_{2}+10^{0}d_{1}\leq10^{5}
\]
When running these integers through f:
\[
f\left(i\right)=d_{5}^{3}+d_{4}^{3}+d_{3}^{3}+d_{2}^{3}+d_{1}^{3}\leq5\cdot9^{3}
\]
The latter inequality follows because each of the 5 digits can be at most 9. So f(i) is much smaller than i for most integers, and f(i) cannot be equal to i for any number with 4 or more digits. Therefore, the only Happy Cube Numbers are [0, 1, 153, 370, 371, 407], as stated earlier.
There is one other interesting behavior to note: some infinitely repeating cycles of integers can occur in Happy Cube sequences. Consider:
print [int(sumCubedDigits(i)) for i in [133, 55, 250, 133]]
Output:
[55, 250, 133, 55]
Therefore, this cycle will repeat forever. How can other cycles be identified? Writing some code provides a relatively quick solution. For example, two-cycle pairs of integers can be identified by applying f() twice and excluding the trivial solutions (immediately repeating numbers).
print [(i,int(sumCubedDigits(i))) for i in range(10000L) if sumCubedDigits(sumCubedDigits(i))==i and sumCubedDigits(i)!=i ]
Output:
[(136, 244), (244, 136), (919, 1459), (1459, 919)]
Bigger integers can't be part of a two-integer cycle because f() decreases too rapidly.
Applying similar logic, we can find three-integer cycles:
s3 = sumCubedDigits
print [(i,int(s3(i)),int(s3(s3(i)))) for i in range(10000L) if s3(s3(s3(i)))==i and s3(s3(i))!=i and s3(i)!=i ]
Note that I took advantage of python's true object nature to rename the function to s3 to make the command above easier to read. Output:
[(55, 250, 133), (133, 55, 250), (160, 217, 352), (217, 352, 160), (250, 133, 55), (352, 160, 217)]
So if we remove the redundant entries, there are two cycles of length three: (55, 250, 133) and (160, 217, 352), two cycles of length two: (136, 244) and (919, 1459), and five immediately repeating positive integers: 1, 153, 370, 371, 407. Since every iteration of f will rapidly decrease large values of i, we can conjecture that all integers will terminate in one of the above 9 situations.
How do we know that cycles of length four or longer don't exist? Exhaustively searching 4-digit integers with code similar to s3 above shows that such cycles don't occur. A different way to think about it is that this entire exercise for 4-digit integers is trying to solve the following Diophantine equation:
\[
10^{3}d_{4}+10^{2}d_{3}+10d_{2}+d_{1}=d_{4}^{3}+d_{3}^{3}+d_{2}^{3}+d_{1}^{3}
\]
Unfortunately, Diophantine equations are very hard to solve. Brute force solutions are often the only ones available. So let's get back to programming. Among all of the 4-digit integers, how many of them will reach each of the 9 situations? How can we efficiently find the final value or cycle from iterating this process on a given integer?
The first programming tool we'll need for an efficient solution is a form of caching because there is no point in continuing to iterate when we've already found the final value or entered a cycle. Memoization solves the caching problem by avoiding a recalculation of an integer we've previously computed. Here's a straightforward class decorator which takes advantage of the fact that Python objects are true classes:
class memoizer(object):
'''Decorator. Caches results'''
def __init__(self,fn):
self.fn = fn
self.cache = {}
def __call__(self,*args):
if args in self.cache:
return self.cache[args]
else:
val = self.fn(*args)
self.cache[args] = val
return val
The next tool we need is some basic recursion and datatypes. We'll (ab)use the python dictionary to get what we need:
@memoizer
def IterateCubedDigits(N):
return IterateCubedDigits(int(sumCubedDigits(N)))
knownValues = {
(1,): (1),
(55,): (55, 250, 133),
(133,): (55, 250, 133),
(136,): (136, 244),
(153,): (153),
(160,): (160, 217, 352),
(217,): (160, 217, 352),
(244,): (136, 244),
(250,): (55, 250, 133),
(352,): (160, 217, 352),
(370,): (370),
(371,): (371),
(407,): (407),
(919,): (919, 1459),
(1459,): (919, 1459)
}
IterateCubedDigits.cache = knownValues.copy()
By only storing the final result, we can avoid using too much memory. The cache uses O(N) entries for N integers, and lookup times will be log(N). That's a good tradeoff because otherwise, we'd have to iterate many more times to find the final value of each integer.
Finally, we can ask where a given integer goes. E.g., 1337:
print IterateCubedDigits(1337)
Output: [371].
So what happens to the first 20 integers?
list1 = [(i,IterateCubedDigits(i)) for i in range(1,21)]
print '\n'.join(map(str, list1))
Output:
(1, 1)
(2, 371)
(3, 153)
(4, (55, 250, 133))
(5, 371)
(6, 153)
(7, 370)
(8, 371)
(9, 153)
(10, 1)
(11, 371)
(12, 153)
(13, (55, 250, 133))
(14, 371)
(15, 153)
(16, (160, 217, 352))
(17, 371)
(18, 153)
(19, 370)
(20, 371)
371 and 370 look like popular final values for this process. Let's create a histogram out of the first thousand integers.
d={}
list1000 = [(i,IterateCubedDigits(i)) for i in range(1,1001)]
for (k1,v1) in list1000:
if v1 in d:
d[v1] = d[v1]+1
else:
d[v1] = 1
hist = sorted(d.items(), key=lambda x: x[1], reverse=True)
print '\n'.join(map(str,hist))
This code requires a little more explanation. Having generated an explicit list of the final values corresponding to each integer, a dictionary is constructed, and it counts the number of values in each of the final values (either a cycle or a single integer). We then sort it by a descending number of counts and print pairs consisting of the final value and the count.
Output:
(153, 333)
(371, 303)
(370, 174)
((55, 250, 133), 66)
((160, 217, 352), 51)
(407, 30)
((919, 1459), 24)
(1, 10)
((136, 244), 9)
It looks like my intuition was wrong, at least for the integers up to 1000. 153 is actually the most common final value (33.3%), as compared to 371 at 30.3% and 370 at 17.4%.
Going up to 6-digit numbers, those percentages are similar: 153 is 33.3%, 371 is 29.5%, and 370 is 18.1%. Surprisingly, the (136, 244) cycle is less popular than having 1 as a final value.
Output:
(153, 333333)
(371, 295281)
(370, 181041)
((55, 250, 133), 53370)
((160, 217, 352), 39195)
(407, 38052)
((919, 1459), 32274)
(1, 14953)
((136, 244), 12501)
The execution time of this algorithm was under 2 seconds for computing these values for the first million integers. A slower algorithm would surely fail under these conditions.
Conclusions: I explored an interesting number theory problem and produced some concise python code which solves the problem very quickly. Algorithmic and mathematical analyses were used to produce correct results with minimal computation.