Reading and Analyzing the Golang context Source Code

Posted by 夏泽民

https://jiajunhuang.com/articles/2020_04_22-golang_context.md.html In Golang, context is used as the controller between goroutines, for example:



Goribot: Another Go Crawler Framework Besides Colly

Posted by 夏泽民

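https://segmentfault.com/a/1190000022452452 gocolly is a web crawler framework written in Go. It currently has 3400+ stars on GitHub, which puts it at the top of the list of Go crawlers. gocolly is fast and elegant, and it exposes a set of interfaces in the form of callback functions that can be used to build any kind of crawler.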



How to Manage Database Timeouts and Cancellations in Go

Posted by 夏泽民

https://www.alexedwards.net/blog/how-to-manage-database-timeouts-and-cancellations-in-go One of the great features of Go is that it’s possible to cancel database queries while they are still running via a context.Context instance (so long as cancellation is supported by your database driver).



binlog

Posted by 夏泽民

https://dev.mysql.com/doc/refman/5.7/en/gis-wkb-functions.html
https://static.googleusercontent.com/media/research.google.com/zh-CN//pubs/archive/36726.pdf
https://dev.mysql.com/doc/internals/en/binlog-row-image.html
https://dev.mysql.com/doc/refman/5.7/en/server-system-variables.html
https://dev.mysql.com/doc/refman/5.7/en/replication-options-binary-log.html
https://dev.mysql.com/doc/refman/5.7/en/binary-log.html
https://dev.mysql.com/doc/refman/5.7/en/mysqlbinlog.html
https://dev.mysql.com/doc/internals/en/binary-log.html



Parquet

Posted by 夏泽民

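Apache Parquet is a columnar storage format that stores nested data efficiently. A Parquet file consists of a header, one or more blocks that follow it, and a footer that ends the file. The header contains only ... Each block of a Parquet file stores one row group; a row group is made up of column chunks, and each column chunk stores the data of one column. The data in each column chunk is organized in pages ...

Why we chose Parquet: we had been using Hadoop, and I always had a nagging question that I never looked into. Yesterday it came to mind again, so I followed it up. After some adjustments, queries that used to take a minute in Presto now basically return in seconds. This has nothing to do with Presto and everything to do with the file storage format. Our data on HDFS was stored in the default text format, so Hive and Presto were both computing over text files. Hadoop itself does a full table scan, just a distributed one, so what we had been running was a distributed full table scan that never delivered what a data warehouse should. Columnar storage is naturally suited to analytics: on tables with tens of millions of rows, count, sum, and group by return in seconds!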
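To make the row-versus-column point concrete, here is a toy sketch in Go. It is not Parquet's actual on-disk layout (no row groups, pages, encoding, or compression), and the `Row` and `Columns` types and the sample data are assumptions made purely for illustration. It only shows why an aggregate over one column has less data to walk through when values are stored column by column.

```go
package main

import "fmt"

// Row is a hypothetical record in a row-oriented layout: all fields of
// one record sit together, so scanning one field walks past the others.
type Row struct {
	User   string
	City   string
	Amount int64
}

// Columns holds the same records column by column, loosely analogous to
// the column chunks inside a row group.
type Columns struct {
	User   []string
	City   []string
	Amount []int64
}

// toColumns pivots row-oriented records into the columnar layout.
func toColumns(rows []Row) Columns {
	var c Columns
	for _, r := range rows {
		c.User = append(c.User, r.User)
		c.City = append(c.City, r.City)
		c.Amount = append(c.Amount, r.Amount)
	}
	return c
}

// sumRows iterates over whole records to add up a single field.
func sumRows(rows []Row) int64 {
	var total int64
	for _, r := range rows {
		total += r.Amount
	}
	return total
}

// sumAmount iterates over the Amount column only; User and City are
// never visited.
func sumAmount(c Columns) int64 {
	var total int64
	for _, v := range c.Amount {
		total += v
	}
	return total
}

func main() {
	rows := []Row{
		{"a", "Beijing", 10},
		{"b", "Shanghai", 20},
		{"c", "Beijing", 12},
	}
	cols := toColumns(rows)
	fmt.Println(sumRows(rows), sumAmount(cols)) // prints "42 42"
}
```

Both functions return the same total; the difference on disk is that a columnar reader can skip the bytes of every column the query does not mention.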



This blog is maintained by 夏泽民

Get in touch with me at 465474307@qq.com
