
Codeforces 963D. Frequency of String [Hashing]

Author: Internet

My legs tell me to stop, but my heart will not let me.

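Today I once again spent the whole morning running around for something I knew was most likely impossible. Over the day I embarrassed myself many times, made many mistakes, met many people, and came away with many unexpected gains. I have chosen my way of surviving, and I am trying hard to grow wild. It is now one in the morning and I don't dare go to sleep, because nightfall always uncovers my despair; it is always like this. What was Wuhan University's motto again? 自强弘毅 (self-improvement and perseverance)? Self-improvement I understand; perseverance I would rather not understand. I idly browse CNKI, where I can now download freely. The library also holds plenty of foreign-language material, and it is the only place I have left to dwell. I don't understand why I have to make myself so busy and frantic right at the start of term. Perhaps I just don't want problems that effort could solve to turn into outcomes that only time can make me accept. To accept each step, one must not be afraid of accepting; yet to move forward, one must not be content to accept. So what am I supposed to do? Spend a lifetime under the haze of the past, numb to the point of no longer seeing what the haze buries, walking and chanting alone in the dark.

Now, reading the problem carefully: the total length of all query strings is \(O(n)\), so they take at most \(O(\sqrt{n})\) distinct lengths, because \(t\) distinct lengths already sum to at least \(1 + 2 + \dots + t = t(t+1)/2\). So the queries can be grouped by length; each length costs a single \(O(n)\) pass over the text, and the total complexity is \(O(n \sqrt{n})\). For the query strings of a given length, the approach is brute force: walk over every substring of that length and record the starting positions of every target string (there are many ways to do this; here I simply use \(hash\)). Once the starting positions are known, the rest is straightforward.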
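The final step, once the starting positions of a query string are known, can be sketched in isolation. This is only an illustrative sketch: the function name `shortestWindow` and its parameters are hypothetical and do not appear in the original code. It assumes the positions are sorted in increasing order. Any window holding at least \(k\) occurrences contains \(k\) consecutive ones, so it is enough to try every run of \(k\) consecutive positions.

```cpp
#include <algorithm>
#include <climits>
#include <vector>

// Hypothetical helper (not from the original post).
// pos: starting positions of one query string inside the text, increasing.
// k:   required number of occurrences; m: length of the query string.
// Returns the length of the shortest substring of the text that contains at
// least k occurrences, or -1 if the query string occurs fewer than k times.
int shortestWindow(const std::vector<int>& pos, int k, int m) {
    if (static_cast<int>(pos.size()) < k) return -1;
    int best = INT_MAX;
    for (int j = k - 1; j < static_cast<int>(pos.size()); ++j) {
        // Window from the start of occurrence j-k+1 to the end of occurrence j.
        best = std::min(best, pos[j] - pos[j - k + 1] + m);
    }
    return best;
}
```

For the text `aaaaa`, the query `aa` starts at positions 1, 2, 3, 4, so `shortestWindow({1, 2, 3, 4}, 3, 2)` evaluates to 4.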

Code:
#include <bits/stdc++.h>
using namespace std;
constexpr int N = 1e6 + 3;
unsigned long long h[N], b[N], h_2[N]; // array sizes also bothered me for a while; actually, I still haven't figured out what went wrong...
int length[N], id[N], k[N];
vector<int> ans[N];
char str[N], q_str[N];
int main() {
//	ios::sync_with_stdio(0), cin.tie(0), cout.tie(0); // left disabled on purpose: with synchronization off, cin and scanf read through separate buffers, so mixing them can consume the input in the wrong order
	scanf("%s", str + 1);
	int len = strlen(str + 1);
	b[0] = 1;
	constexpr unsigned long long base = 1331; // base of the polynomial hash; the loop below precomputes prefix hashes and powers
	for(int i = 1; i <= len; ++i) {
		h[i] = h[i - 1] * base + str[i];
		b[i] = b[i - 1] * base;
	}
	int n;
	cin >> n;
	unordered_map<unsigned long long, int> mp;
	for(int i = 1; i <= n; ++i) {
		cin >> k[i];
		scanf("%s", q_str + 1); // read into the global buffer: a query may be longer than s, so a local array of size len + 1 could overflow
		id[i] = i;
		length[i] = strlen(q_str + 1);
		for(int j = 1; j <= length[i]; ++j) {
			h_2[i] = h_2[i] * base + q_str[j]; // natural overflow: unsigned arithmetic wraps modulo 2^64, so no explicit modulus is needed
		}
		mp[h_2[i]] = i;
	}
	sort(id + 1, id + n + 1, [&](int x, int y) {
		return length[x] < length[y]; // sort query indices by length so that equal lengths become adjacent
		/*
			x and y are already query indices (elements of id), so compare length[x], not length[id[x]]. Writing length[id[x]] looks harmless while id[i] == i, but it stops being accurate as soon as sorting starts to permute id.
		*/
	});
	int i = 1;
	while(i <= n) {
		auto hash = [&](int l, int r) -> unsigned long long {
			return h[r] - h[l - 1] * b[r - l + 1]; // O(1) hash of the substring str[l..r] from the prefix hashes
		};
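		const int cur = length[id[i]]; // length shared by the current group of queries
		for(int j = 1; j + cur - 1 <= len; ++j) { // enumerate every substring of this length
			auto it = mp.find(hash(j, j + cur - 1));
			if(it != mp.end() && length[it->second] == cur) { // ignore accidental matches against a query of another length
				ans[it->second].push_back(j); // record the starting position; positions are appended in increasing order
			}
		}
		while(i < n && length[id[i]] == length[id[i + 1]]) ++i; // skip the remaining queries of the same length
		++i;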
	}
	for(int i = 1; i <= n; ++i) {
		if((int)ans[i].size() < k[i]) { // the query string occurs fewer than k[i] times
			printf("-1\n");
		}
		else {
			int minn = 0x7fffffff;
			for(int j = k[i] - 1; j < (int)ans[i].size(); ++j) { // vector indices are 0-based, so the k-th occurrence sits at index k - 1
				minn = min(minn, ans[i][j] - ans[i][j - k[i] + 1] + length[i]); // window spanning k[i] consecutive occurrences; keep the shortest
			}
			cout << minn << "\n";
		}
	}
	return 0;
}
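
The substring-hash identity that the solution relies on can also be checked on its own. The following is a minimal sketch, not the author's code: the struct name `PrefixHash` and its members are hypothetical, and it assumes the same base 1331 and the same natural 64-bit overflow as the solution above.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper (not from the original post): prefix hashes of a string
// using natural unsigned 64-bit overflow, i.e. arithmetic modulo 2^64.
struct PrefixHash {
    static constexpr std::uint64_t kBase = 1331; // same base as the solution
    std::vector<std::uint64_t> h, pw;

    explicit PrefixHash(const std::string& s)
        : h(s.size() + 1, 0), pw(s.size() + 1, 1) {
        for (std::size_t i = 1; i <= s.size(); ++i) {
            h[i] = h[i - 1] * kBase + static_cast<unsigned char>(s[i - 1]);
            pw[i] = pw[i - 1] * kBase;
        }
    }

    // Hash of the 1-indexed, inclusive substring s[l..r], computed in O(1).
    std::uint64_t get(int l, int r) const {
        return h[r] - h[l - 1] * pw[r - l + 1];
    }
};
```

With `PrefixHash p("abcabc")`, the calls `p.get(1, 3)` and `p.get(4, 6)` return the same value because both ranges spell `abc`. Equal substrings always hash equally; the reverse only holds with high probability, which is the usual caveat of hashing.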

Source: https://www.cnblogs.com/bingoyes/p/16607042.html