MySQL update with a self-referencing query
I have a surveys table that contains (among others) the following columns:
survey_id - unique id
user_id - the id of the person the survey relates to
created - datetime
ip_address - of the submission
ip_count - the number of duplicates
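Because the record set is large, running this query on the fly is impractical, so I am trying to write an UPDATE statement that periodically stores a "cached" result in ip_count.
The purpose of ip_count is to show how many duplicate survey submissions have been received from the same ip_address for the same user_id within a 12-month window (+/- 6 months from the created date).
With the following data set, this is the expected result.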
survey_id  user_id  created    ip_address   ip_count  # counted duplicate survey_ids
1          1        01-Jan-12  123.132.123  1         # 2
2          1        01-Apr-12  123.132.123  2         # 1, 3
3          2        01-Jul-12  123.132.123  0         #
4          1        01-Aug-12  123.132.123  3         # 2, 6
6          1        01-Dec-12  123.132.123  1         # 4
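To make the rule concrete, here is a minimal pure-Python sketch of one reading of it: for each survey, count the other surveys with the same user_id and ip_address whose created date falls within 6 months either side. The helper names `shift_months` and `ip_counts` are made up for illustration, the `state` filter is left out, and every row is assumed to have an ip_address.

```python
import calendar
from datetime import date


def shift_months(d, months):
    """Move d by a whole number of months, clamping the day to the month's length."""
    index = d.year * 12 + (d.month - 1) + months
    year, month = divmod(index, 12)
    month += 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def ip_counts(rows):
    """rows: (survey_id, user_id, created, ip_address) tuples -> {survey_id: ip_count}."""
    rows = list(rows)
    counts = {}
    for sid, uid, created, ip in rows:
        lo, hi = shift_months(created, -6), shift_months(created, 6)
        counts[sid] = sum(
            1
            for other_sid, other_uid, other_created, other_ip in rows
            if other_sid != sid          # do not count the survey itself
            and other_uid == uid
            and other_ip == ip
            and lo <= other_created <= hi
        )
    return counts


# The sample rows from the table above.
sample = [
    (1, 1, date(2012, 1, 1), "123.132.123"),
    (2, 1, date(2012, 4, 1), "123.132.123"),
    (3, 2, date(2012, 7, 1), "123.132.123"),
    (4, 1, date(2012, 8, 1), "123.132.123"),
    (6, 1, date(2012, 12, 1), "123.132.123"),
]
print(ip_counts(sample))
```

Under this reading the sample rows give 1, 2, 0, 2, 1. The value for survey 4 comes out as 2 rather than the 3 listed above, because only surveys 2 and 6 fall inside its window.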
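This is the closest solution I have come up with so far, but this query does not take the date restriction into account, and I am struggling to come up with another approach.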
UPDATE surveys
JOIN (
    SELECT ip_address, created, user_id, COUNT(*) AS total
    FROM surveys
    WHERE surveys.state IN (1, 3) # survey is marked as completed and confirmed
    GROUP BY ip_address, user_id
) AS ipCount
ON (
    ipCount.ip_address = surveys.ip_address
    AND ipCount.user_id = surveys.user_id
    AND ipCount.created BETWEEN (surveys.created - INTERVAL 6 MONTH) AND (surveys.created + INTERVAL 6 MONTH)
)
SET surveys.ip_count = ipCount.total - 1 # minus 1 as this query will match on its own id.
WHERE surveys.ip_address IS NOT NULL # ignore surveys where we have no ip_address
Thanks in advance for your help :)
Answer:
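I don't have your table, so it is hard for me to write SQL that is guaranteed to work, but I can take a shot at it and hopefully it helps...
First, I need the Cartesian product of surveys with itself, filtered down to only the rows I want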
select s1.survey_id x, s2.survey_id y
from surveys s1
join surveys s2
  on s1.survey_id != s2.survey_id
 and s1.ip_address = s2.ip_address
 and s1.user_id = s2.user_id -- same user, as the question requires
 and s2.created between s1.created - interval 6 month and s1.created + interval 6 month
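This output should contain every pair of surveys that match (according to your rules) TWICE: once with each id in the first position, and once with it in the second position.
We can then do a GROUP BY on this output to get a table that gives the correct ip_count for each survey_id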
select x, count(*) c
from (
  select s1.survey_id x, s2.survey_id y
  from surveys s1
  join surveys s2
    on s1.survey_id != s2.survey_id
   and s1.ip_address = s2.ip_address
   and s1.user_id = s2.user_id
   and s2.created between s1.created - interval 6 month and s1.created + interval 6 month
) pairs -- MySQL requires an alias on every derived table
group by x
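So now we have a table that maps each survey_id to its correct ip_count. To update the original table, we need to join it against this result and copy the value across.
So it should look something like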
UPDATE surveys s
LEFT JOIN (ABOVE QUERY) n ON s.survey_id = n.x
SET s.ip_count = COALESCE(n.c, 0) -- LEFT JOIN so surveys with no match are reset to 0
There is some pseudocode in there, but I think the general idea should work
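The three steps above (pair up, group, copy back) can be tried end to end with a small Python sketch. It uses an in-memory SQLite database as a stand-in for MySQL, so the date arithmetic is written as `date(x, '+6 months')` instead of `x + INTERVAL 6 MONTH`, and month-end handling can differ between the two engines. The sample rows come from the question, the `state` filter is left out, and `created` is assumed to be stored as an ISO `YYYY-MM-DD` string.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE surveys (survey_id INTEGER PRIMARY KEY, user_id INTEGER, "
    "created TEXT, ip_address TEXT, ip_count INTEGER DEFAULT 0)"
)
conn.executemany(
    "INSERT INTO surveys (survey_id, user_id, created, ip_address) VALUES (?, ?, ?, ?)",
    [
        (1, 1, "2012-01-01", "123.132.123"),
        (2, 1, "2012-04-01", "123.132.123"),
        (3, 2, "2012-07-01", "123.132.123"),
        (4, 1, "2012-08-01", "123.132.123"),
        (6, 1, "2012-12-01", "123.132.123"),
    ],
)

# Steps 1 and 2: self-join into matching pairs, then GROUP BY to count per survey.
# The LEFT JOIN keeps surveys that have no match, so they get a count of 0.
PAIR_COUNTS = """
SELECT s1.survey_id AS x, COUNT(s2.survey_id) AS c
FROM surveys s1
LEFT JOIN surveys s2
  ON s1.survey_id != s2.survey_id
 AND s1.ip_address = s2.ip_address
 AND s1.user_id = s2.user_id
 AND s2.created BETWEEN date(s1.created, '-6 months') AND date(s1.created, '+6 months')
GROUP BY s1.survey_id
"""
counts = conn.execute(PAIR_COUNTS).fetchall()

# Step 3: copy each count back onto its survey in a single transaction.
with conn:  # commits on success, rolls back on error
    conn.executemany(
        "UPDATE surveys SET ip_count = ? WHERE survey_id = ?",
        [(c, x) for x, c in counts],
    )

result = dict(conn.execute("SELECT survey_id, ip_count FROM surveys ORDER BY survey_id"))
print(result)
```

In MySQL the read and the write can be combined into one multi-table UPDATE, so the round trip through Python is only there to keep the sketch runnable on SQLite.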
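I have never updated a table from the output of another query before. I tried to guess the right syntax from this question: How do I UPDATE from a SELECT in SQL Server? Note that MySQL's multi-table UPDATE puts the JOIN before SET, whereas SQL Server uses UPDATE ... SET ... FROM.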
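Also, if I needed to do something like this for my own work, I would not try to do it in a single query. That would be hard to maintain and could run into memory/performance problems. It is better to have a script walk the table row by row, updating a single row in a transaction and then moving on to the next row. It is slower, but easier to understand and probably lighter on the database.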
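A minimal sketch of that row-by-row script, again in Python with an in-memory SQLite database standing in for MySQL. The function name `refresh_ip_counts` is made up for illustration, `created` is assumed to hold ISO `YYYY-MM-DD` strings, and the `state` filter from the question is omitted. For a really large table the id list would be read in batches rather than with a single `fetchall()`.

```python
import sqlite3

# Counts the other surveys that share user and IP within 6 months of one survey.
COUNT_FOR_ROW = """
SELECT COUNT(*) FROM surveys
WHERE survey_id != :survey_id
  AND user_id = :user_id
  AND ip_address = :ip_address
  AND created BETWEEN date(:created, '-6 months') AND date(:created, '+6 months')
"""


def refresh_ip_counts(conn):
    """Recompute ip_count one survey at a time, committing after each row."""
    rows = conn.execute(
        "SELECT survey_id, user_id, created, ip_address FROM surveys "
        "WHERE ip_address IS NOT NULL ORDER BY survey_id"
    ).fetchall()
    for survey_id, user_id, created, ip_address in rows:
        params = {
            "survey_id": survey_id,
            "user_id": user_id,
            "created": created,
            "ip_address": ip_address,
        }
        count = conn.execute(COUNT_FOR_ROW, params).fetchone()[0]
        with conn:  # one short transaction per row
            conn.execute(
                "UPDATE surveys SET ip_count = ? WHERE survey_id = ?",
                (count, survey_id),
            )
    return len(rows)


# Demo on the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE surveys (survey_id INTEGER PRIMARY KEY, user_id INTEGER, "
    "created TEXT, ip_address TEXT, ip_count INTEGER DEFAULT 0)"
)
conn.executemany(
    "INSERT INTO surveys (survey_id, user_id, created, ip_address) VALUES (?, ?, ?, ?)",
    [
        (1, 1, "2012-01-01", "123.132.123"),
        (2, 1, "2012-04-01", "123.132.123"),
        (3, 2, "2012-07-01", "123.132.123"),
        (4, 1, "2012-08-01", "123.132.123"),
        (6, 1, "2012-12-01", "123.132.123"),
    ],
)
conn.commit()

updated = refresh_ip_counts(conn)
result = dict(conn.execute("SELECT survey_id, ip_count FROM surveys ORDER BY survey_id"))
print(updated, result)
```

Each UPDATE touches a single row, so locks are held only briefly and a failure part-way through leaves the rows already processed with their new values.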
Tags: mysql, self-join Source: https://codeday.me/bug/20190626/1291259.html