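Thinkなブログ: NLP 100 Exercises (言語処理100本ノック), Problem 03

Tuesday, December 1, 2020

NLP 100 Exercises (言語処理100本ノック), Problem 03

Problem: Split the sentence "Now I need a drink, alcoholic of course, after the heavy lectures involving quantum mechanics." into words, and create a list of the number of (alphabetic) characters in each word, in order of appearance from the start of the sentence.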

Sample solution

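import re

# Use a name other than str so the built-in str type is not shadowed
sentence = 'Now I need a drink, alcoholic of course, after the heavy lectures involving quantum mechanics.'
sentence = re.sub(r'[,.]', '', sentence)  # remove commas and periods
words = sentence.split()  # split on whitespace to get a list of words
ans = [len(w) for w in words]  # len returns the number of characters in each word

print(ans)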

Output

[3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8, 9, 7, 9]
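As the title of the last reference link ("03. 円周率", i.e. pi) suggests, these word lengths spell out the leading digits of pi. The sketch below is one way to check that; the variable names are my own, and it assumes that math.pi, being a float, is only trusted for the first 15 digits compared here.

```python
import math

# Word lengths taken from the output above
lengths = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8, 9, 7, 9]

# Join the lengths into a single string of digits
digits = ''.join(str(n) for n in lengths)

# str(math.pi) is '3.141592653589793'; drop the decimal point
# and keep as many leading digits as we have word lengths
pi_digits = str(math.pi).replace('.', '')[:len(digits)]

print(digits)
print(digits == pi_digits)
```

If the lengths are correct, the second print shows True.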

So what happens if we run the program without removing the commas and periods?

Program

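# No regular expression is used here, so import re is not needed
sentence = 'Now I need a drink, alcoholic of course, after the heavy lectures involving quantum mechanics.'
words = sentence.split()  # punctuation stays attached to the words
ans = [len(w) for w in words]

print(ans)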

Output

[3, 1, 4, 1, 6, 9, 2, 7, 5, 3, 5, 8, 9, 7, 10]

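The commas and the period are each counted as one character, so "drink,", "course," and "mechanics." come out one longer than they should (6, 7 and 10 instead of 5, 6 and 9). The re library imported in the sample solution is Python's module for regular expressions (re: regular expressions).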
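Since re is a regular expression library, it can also pull out the words directly instead of deleting the punctuation first. The following is a minimal sketch of that idea, not part of the original solution; it assumes a word is simply a run of ASCII letters, so a word such as "don't" would be split in two.

```python
import re

sentence = 'Now I need a drink, alcoholic of course, after the heavy lectures involving quantum mechanics.'

# [A-Za-z]+ matches each maximal run of ASCII letters,
# so commas and periods never become part of a word
words = re.findall(r'[A-Za-z]+', sentence)
lengths = [len(w) for w in words]

print(lengths)
```

For this sentence the result is the same list as in the sample solution.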

Alternative solution

raw_text = 'Now I need a drink, alcoholic of course, after the heavy lectures involving quantum mechanics.'

# Replace each . and , in raw_text with "" (the empty string)
text = raw_text.replace('.', '').replace(',', '')

# Split text on whitespace with text.split(), then take len of each word in order
ans = [len(w) for w in text.split()]
print(ans)

Output

[3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8, 9, 7, 9]

Incidentally, print(text) produces the following.

Output

Now I need a drink alcoholic of course after the heavy lectures involving quantum mechanics
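Both solutions above strip the punctuation from the text before counting. Because the problem asks for the number of alphabetic characters, another option is to leave the text untouched and count only the letters in each whitespace-separated token. This is a sketch under that reading of the problem, with names of my own choosing; note that a token made up only of punctuation would yield 0 rather than being skipped.

```python
sentence = 'Now I need a drink, alcoholic of course, after the heavy lectures involving quantum mechanics.'

# For each whitespace-separated token, count only the characters
# for which str.isalpha() is true (punctuation is ignored)
lengths = [sum(ch.isalpha() for ch in token) for token in sentence.split()]

print(lengths)
```

This again gives the same list as the sample solution for this sentence.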

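Reference links

[NLP 100 Exercises 2020] Chapter 1: Warm-up (【言語処理100本ノック 2020】第1章: 準備運動)

How to work with regular expressions using Python's re module [for beginners] (Pythonのモジュールreで正規表現操作を行う方法【初心者向け】)

Python documentation: re --- Regular expression operations (Pythonドキュメント：re --- 正規表現)

NLP 100 Exercises 2020, "03. Pi" (alternative solution) (言語処理100本ノック 2020「03. 円周率」（別解）)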
