HaskellでProject Euler(Problem 22～24)

今年もオイラー頑張っていくぞ！
プロジェクトオイラー用の共通ライブラリを作ったので、今回からこれを使っていくことにする。
今後も問題解いていく中で汎用的に使えそうなものはライブラリ側に切り出して充実させていきたい。

Problem 22. Names scores

Using names.txt (right click and 'Save Link/Target As...'), a 46K text file containing over five-thousand first names,
begin by sorting it into alphabetical order.
Then working out the alphabetical value for each name,
multiply this value by its alphabetical position in the list to obtain a name score.

For example, when the list is sorted into alphabetical order, COLIN,
which is worth 3 + 15 + 12 + 9 + 14 = 53, is the 938th name in the list.
So, COLIN would obtain a score of 938 × 53 = 49714.

What is the total of all the name scores in the file?
Problem 22. Names scores
5000個以上の名前が書かれている46Kのテキストファイル names.txt を用いる. まずアルファベット順にソートせよ.

のち, 各名前についてアルファベットに値を割り振り, リスト中の出現順の数と掛け合わせることで, 名前のスコアを計算する.

たとえば, リストがアルファベット順にソートされているとすると, COLINはリストの938番目にある.
またCOLINは 3 + 15 + 12 + 9 + 14 = 53 という値を持つ. よってCOLINは 938 × 53 = 49714 というスコアを持つ.

ファイル中の全名前のスコアの合計を求めよ.
Problem 22 「名前のスコア」

名前のリストはダブルクオートで囲まれカンマで区切られている。
Text.Regex.matchRegexAllは正規表現にマッチすると
「<マッチ前の文字列>, <マッチした文字列>, <マッチ後の文字列>, <部分式にマッチした文字列>」を
Maybeにくるんで返すので、これを再帰的に呼んで全ての名前を取り出すことにした。

工夫したのは正規表現を使って名前を取り出すところとファイル読み込みのためにdo式を使ったところくらいで、
後は素直に実装できたと思う。
アルファベットからスコアを算出するのには、'A'との差を取るようにした。

Problem 23. Non-abundant sums

A perfect number is a number for which the sum of its proper divisors is exactly equal to the number.
For example, the sum of the proper divisors of 28 would be 1 + 2 + 4 + 7 + 14 = 28, which means that 28 is a perfect number.

A number n is called deficient if the sum of its proper divisors is less than n and it is called abundant if this sum exceeds n.

As 12 is the smallest abundant number, 1 + 2 + 3 + 4 + 6 = 16,
the smallest number that can be written as the sum of two abundant numbers is 24.
By mathematical analysis, it can be shown that all integers greater than 28123 can be written as the sum of two abundant numbers.
However, this upper limit cannot be reduced any further by analysis even though
it is known that the greatest number that cannot be expressed as the sum of two abundant numbers is less than this limit.

Find the sum of all the positive integers which cannot be written as the sum of two abundant numbers.
Problem 23. Non-abundant sums
完全数とは, その数の真の約数の和がそれ自身と一致する数のことである.
たとえば, 28の真の約数の和は, 1 + 2 + 4 + 7 + 14 = 28 であるので, 28は完全数である.

真の約数の和がその数よりも少ないものを不足数といい, 真の約数の和がその数よりも大きいものを過剰数と呼ぶ.

12は, 1 + 2 + 3 + 4 + 6 = 16 となるので, 最小の過剰数である. よって2つの過剰数の和で書ける最少の数は24である.
数学的な解析により, 28123より大きい任意の整数は2つの過剰数の和で書けることが知られている.
2つの過剰数の和で表せない最大の数がこの上限よりも小さいことは分かっているのだが, この上限を減らすことが出来ていない.

2つの過剰数の和で書き表せない正の整数の総和を求めよ.
Problem 23 「非過剰数和」

この問題からZaneli.Eulerモジュールを使用するようにした。
頭の中でロジックを組み立てるのは割と容易だったと思うのだけど、かなり苦戦した…。
結局自力では解けなかったなぁ…。

まず、以下のような処理を思いついた。

import Data.List ((\\))
import Zaneli.Euler (divSum)

main = print $ sum $ [1..maxValue] \\ abundants

-- 2つの過剰数の和で書き表せる値のリスト
--abundants :: [Integer]
abundants = let abundant = [n | n <- [minValue..maxValue], (divSum n) > n] in
            [ x + y | x <- abundant, y <- abundant, (x + y) <= maxValue]

minValue = 12
maxValue = 28123

2つの過剰数の和で表すことができる全ての数値をまず洗い出し、それらを除いたリストを作る。
しかしこれは要素数が大きすぎるのか「Stack space overflow: current size 8388608 bytes.」が発生する。

次に、以下のような処理を思いついた。

import Zaneli.Euler (divSum)

main = print $ foldl addNonAbds 0 [1..maxValue]
  where addNonAbds sum n | elem n abundants = sum
                         | otherwise        = sum + n

-- 2つの過剰数の和で書き表せる値のリスト
abundants :: [Integer]
abundants = [ x + y | x <- abundant, y <- abundant, (x + y) <= maxValue]
  where
    abundant = [n | n <- [minValue..maxValue], (divSum n) > n]

minValue = 12
maxValue = 28123

こちらはいつまで経っても処理が終わらない…。
elem n abundantsでabundantsの要素数が大きすぎるため時間がかかっていると思われる。

色々考えてみたがうまいやり方を思いつかなかった。
他の解法を調べてみたところ、どうもData.Arrayモジュールで配列を使うのがよくやるパターンのようだ。
abundants がソートされていれば、elemではなくn == head abundantsなどで
リストの先頭を取り除きながら比較することもできるかと思ったが、sortも終わらない…。

というわけで、配列を使うとこんな感じ。

なるほどなぁ…。
一度過剰数かどうかを調べてインデックスの数値が過剰数かどうかをTrue/Falseで持つ配列を作っておいて、
ある値からある過剰数を引いた値も過剰数かどうかを配列のインデックスから調べる、ということか。
過剰数のリストと過剰数かどうかを表す配列をそれぞれisAbundant nを呼んで作っている箇所を
もうちょっと何とかできそうな気もしたけれど、とりあえずこれで。

Problem 24. Lexicographic permutations

A permutation is an ordered arrangement of objects.
For example, 3124 is one possible permutation of the digits 1, 2, 3 and 4.
If all of the permutations are listed numerically or alphabetically, we call it lexicographic order.
The lexicographic permutations of 0, 1 and 2 are:

012 021 102 120 201 210
What is the millionth lexicographic permutation of the digits 0, 1, 2, 3, 4, 5, 6, 7, 8 and 9?
Problem 24. Lexicographic permutations
順列とはモノの順番付きの並びのことである.
たとえば, 3124は数 1, 2, 3, 4 の一つの順列である.
すべての順列を数の大小でまたは辞書式に並べたものを辞書順と呼ぶ. 0と1と2の順列を辞書順に並べると

012 021 102 120 201 210
になる.
0,1,2,3,4,5,6,7,8,9からなる順列を辞書式に並べたときの100万番目はいくつか?
Problem 24 「辞書式順列」

こちらは頭の中で組み立てたロジックがうまく答えに繋がった感。
まずは小さな物事から考えてみよう。
例として挙げられている0と1と2で考えてみると、全ての組み合わせは

となり、辞書式に並べたときのn番目が知りたい場合、先頭要素を調べるのは簡単そうだ。
数値は3つ、数値を並べた全部で6通りあり、辞書順に並べると先頭要素は6/3=2個ずつ同じ数値が連続して表れる。
n番目が知りたい場合、n=1or2なら先頭要素は0, n=3or4なら先頭要素は1, n=5or6なら先頭要素は2。
n=5とすると、先頭が2に確定したので次の要素を調べたい。
次の要素以降は

となる。
知りたいn番目を算出し直す必要があるが、
「前要素で知りたいn番目-(同じ数値が連続して表れる個数 * [0,1,2]の中で前要素として確定した数値の要素番号)」で算出できる。
n = 5 - (2 * 2) = 1で、1番目が新たに知りたい番号。
こうやって再帰的に先頭の一要素ずつ確定させていけばよさそうだ。

頭の中で組み立てたロジックをHaskellのコードに落とし込み、0, 1, 2 の場合にうまくいくことを確認できた。
さらにそのまま問題の0～9の場合の100万番目も算出できることを確認できた。
思考の流れがうまくコードに落とし込めて想定通りに動いてくれた。
これは気持ちいい。