Big Data, Crystal Balls and Looking Glasses: Reviewing 2016, predicting 2017


End-of-year reviews are boring -- and everyone does them. Predictions are boring -- and they are hard. Of course, this is different -- because big data.

How do big data people go about making end-of-year reviews and predictions? Using data is the obvious answer, but there are a few issues with that approach: there is no synthesis in data alone -- you have to find the story behind the data, pick an angle, and seek meaning. In addition, that approach does not account for subtle hints, industry knowledge, and big ideas.

To paraphrase Carl Sagan, "we wish to find the truth, no matter where it lies. But to find the truth we need imagination and data both. We will not be afraid to speculate, but we will be careful to distinguish speculation from fact." In this spirit, let's keep things equally opinionated and objective in 2017.

It's the end of Hadoop as we know it, and I feel fine

Hadoop turned 10 in 2016. It's come a long way from a pet project named after a toy elephant to the (metaphorical) stampeding beast now on almost every CXO's name-dropping list. The latest Big Data maturity survey showed that 73 percent of respondents are now in production with Hadoop (vs. 65 percent last year). And yet we're here to tell you Hadoop as we know it is dead. And that's not even news.

Hadoop has been constantly evolving, expanding, and re-inventing itself throughout its lifetime. A massive ecosystem has been developing around the initial bare-bones offering, and today Hadoop is more of a platform than "just" a storage and compute framework. The introduction of YARN was a game changer, enabling Hadoop to become a Big Data OS and to break away from its batch-oriented MapReduce origins.

In 2016, data and stories from the trenches all pointed in the same direction: batch, MapReduce Hadoop is dead; long live real-time, Spark Hadoop. Today, 25 percent of organizations are using Spark in production, with an additional 33 percent using it in development, and all major Hadoop vendors are involved in it. Adding these figures up suggests that by the end of 2017 up to 50 percent of organizations could be using Spark in production.
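The arithmetic behind that estimate can be sketched as follows. This is only an illustration of the reasoning, not the survey's own method: the 25 and 33 percent figures come from the text above, while the function name and the idea of a "conversion rate" from development to production are assumptions introduced here.

```python
def required_conversion(in_production: float, in_development: float, target: float) -> float:
    """Fraction of development-stage organizations that would have to
    move to production for the production share to reach `target`.

    All arguments are percentages of surveyed organizations.
    """
    if in_development <= 0:
        raise ValueError("in_development must be positive")
    return (target - in_production) / in_development


# Figures quoted in the article (percent of organizations).
in_production = 25.0
in_development = 33.0

# Upper bound: every organization now in development ships to production.
ceiling = in_production + in_development

# Hypothetical conversion rate implied by the "up to 50 percent" estimate.
needed = required_conversion(in_production, in_development, target=50.0)

print(f"Ceiling if every development project ships: {ceiling:.0f}%")
print(f"Conversion rate needed to reach 50%: {needed:.0%}")
```

Read this way, "up to 50 percent" sits below the 58 percent ceiling and would require roughly three out of four development-stage organizations to reach production within the year.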

But it's not necessarily a Spark-or-bust future: neither is Spark the only streaming game in town, nor is Hadoop the only Big Data platform. Alternatives do exist, and users may migrate or leapfrog to them, skipping Spark or Hadoop altogether, the same way they are now migrating from or skipping MapReduce.
The Big Data landscape is host to a multitude of different approaches. But more and more it looks like everyone is adding everyone else's features. Convergence or me-too? Image: Martin Kleppmann.

Becoming all things to all men to save some
Spark can do both streaming and batch processing. And it can also do SQL, and graphs. And of course on Hadoop you can also do SQL and/or NoSQL in a number of other ways, utilizing a wide choice of tools. That's what being an ecosystem is all about, right? But then again, everyone seems to be at it these days.

NoSQL databases like Cassandra / DataStax Enterprise can now also do graph, in addition to key-value, tabular and document. What about the iconic NoSQL document store, MongoDB? Well, besides document, you can now also do SQL. Microsoft's SQL Server? Your average SQL server no more: it can run on Linux, and it supports R, in-memory processing and column store. MariaDB, the poor man's SQL server, also has its column store now.

Neo4J, the iconic graph store? It's going ACID. Google's BigQuery now supports standard SQL, joining Amazon Redshift, which has had it for a while as it's based on Postgres. Of course, analytics-oriented column stores have long supported SQL. And traditional relational DB vendors like Oracle and IBM have been adding features like in-memory processing and column store for a while as well. Key-value stores do it, document stores do it, graph stores do it, even SQL incumbents do it.

The boundaries are blurring, as more and more data platforms try to be more things to more people. Doing almost everything on the same platform is good for vendors that want to increase their retention and good for users who don't want to have to mix and match disparate platforms to get things done. But it's not a sheer land-ho of opportunity - threats lie ahead too. Most notably, vendor lock-in, half-baked features, and half-hearted users.
Some are trying to get the basics right, while some are after up-in-the-sky goals. Yet, there's a place for everyone under Big Data. Image: Martin Kleppmann.

This article is from http://www.zdnet.com/article/big-data-crystal-balls-and-looking-glasses-reviewing-2016-predicting-2017/
