Why so few Computer Science degrees granted?

oldfort · June 20, 2012, 10:10pm

Are you familiar with K language, or KDB DB and analytical tool?

KDB and K are owned by Kx, and there are other software firms that are also using K for their software.

I am not the most technical person, this is what I have heard and read.

BCEagle91 · June 20, 2012, 10:19pm

I looked it up. Looks like a column-store database, also known as an inverted list. I haven’t heard the term inverted matrix before. Inverted lists have actually been around for a long time. Vertica (bought by Hewlett-Packard within the last couple of years) was one of the early companies on this - it had backing from Michael Stonebraker (now at MIT, one of the really big database guys like the late Jim Gray), Jerry Held (former VP at Oracle), and Stanley Zdonik (Brown).

I don’t think that I’d consider a database engine a higher-level-language. A database engine may provide the functionality of a HLL like PL/SQL in Oracle or Interactive SQL in other databases but it has runtime and other dependencies that typical languages don’t.

oldfort · June 20, 2012, 10:33pm

They used K (programming language) to write their own proprietary database. They can do real time analysis on large data faster than traditional DB. Retailers, telecom, Wall St are using their softwares (analytics and DB) to analyze their data. Most of those guys came from WS, they were part of R&D think tank. It was when WS had a lot more money to try to come up with the next generation to beat their competitors. Many best and brightest tech guys used to work on WS before Silicone Valley. Those guys left WS 10+ years ago, used their own money, and now they have found the next frontier.

I speak to many vendors and recruiters, and this seem to be the buzz.

BCEagle91 · June 20, 2012, 11:12pm

I worked on a proprietary database written in Bliss a long time ago. Is K
compiled or interpreted?

They sound like Vertica.

[Real-Time</a> Analytics Platform | Big Data Analytics | MPP Data Warehouse](<a href=“http://www.vertica.com/]Real-Time”>http://www.vertica.com/)

The traditional problem with column-store databases is that they are
not good in transaction processing environments. They are better
suited to data warehouse and analytics. You typically see business
operations using Oracle, DB2, SQL Server, Teradata, maybe even
PostGres with a data feed to a column-store database for data
warehouse and analytics.

The history of the database world has had new technologies that were
said to put the traditional companies out of business. Won Kim’s
object oriented database was supposed to put the relational vendors
out of business. Instead the relational vendors added object support
and the object-oriented companies went out of business. There’s a
history of the large players either buying out, hiring employees of
or implementing solutions on their own.

Kx lists HP, IBM, Intel, Oracle/Sun and Microsoft as their ISV
partners. Four of those companies own a bunch of major database
platforms: Oracle, MySQL, SQL Server, DB2, Vertica and Teradata.

The implication here is that TP operations are run on traditional
databases with feeds to Kdb. Similar to Vertica’s model.

Here’s a bio on Stonebraker - he’s been a giant in the database
industry since the early 1970s.

C-Store and Vertica

In the C-Store project, started in 2005, Stonebraker, along with
colleagues from Brandeis University, Brown University, MIT, and
University of Massachusetts Boston, developed a parallel,
shared-nothing column-oriented DBMS for data warehousing. By dividing
and storing data in columns, C-Store is able to perform less I/O and
get better compression ratios than conventional database systems that
store data in rows.

In 2005, Stonebraker co-founded Vertica to commercialize the
technology behind C-Store[

[Michael</a> Stonebraker - Wikipedia, the free encyclopedia](<a href=“http://en.wikipedia.org/wiki/Michael_Stonebraker]Michael”>Michael Stonebraker - Wikipedia)

I think that he’s built more databases than anyone else in the industry.

It sounds like your guys have some clients that work from data feeds
which is essentially static data. This would work well with a
column-store database becuase you don’t have to randomly change data
as in a Transaction Processing system.

Nothing particularly new about column-store databases. The concepts
go back to products from the 1970s.

oldfort · June 20, 2012, 11:32pm

Some of those guys have gone off to do transactional DB, some with static data in the data warehouse space. The technology may not be something new, but not many have put it in production. I have seen few demos and haven’t seen anything in house (“approved commercial products”) which were comparable.

BCEagle91 · June 20, 2012, 11:53pm

Vertica has been in production for a few years and they had the backing of Kleiner-Perkins and others and now are in HP’s stable.

Microsoft’s SQL Server 2012 database (it’s a traditional database relatively speaking) has column-store indexes. An article on column-store in SQL Server 2012:

Why not use a column store for everything?

While it’s possible to build a system that stores all data in columnar format, row stores still have advantages in some situations. A B-tree is a very efficient data structure for looking up or modifying a single row of data. So if your workload entails many single row lookups and many updates and deletes, which is common for OLTP workloads, you will probably continue to use row store technology. Data warehouse workloads typically scan, aggregate, and join large amounts of data. In those scenarios, column stores really shine.

I know more but am not at liberty to say.

SlitheyTove · June 21, 2012, 9:07am

Wait a minute, I was expecting a popcorn thread, and instead I’m getting something that could really use subtitles. 
:D

oldfort · June 21, 2012, 9:18am

It’s all Greek to me too. I turn into a different person late at night.

mathmom · June 21, 2012, 9:19am

I was thinking the last few posts are the best explanation yet for why there are so few computer scientists! :)

BCEagle91 · June 21, 2012, 9:28am

Back to programming languages after a diversion on databases:

I did some more reading on K and it’s an interpreted language which means that performance isn’t going to be anywhere near as fast as what you’d get using a compiled language. It’s an APL variant (we had APL terminals at Boston College back in the 1970s - I never saw anyone using them), apparently combined with Scheme (modern variant of LISP - also from the 70s or earlier).

I believe that the best performance is obtained from Intel compilers on Intel’s processors. They provide auto-vectorization and auto-parallelization which is pretty hard stuff to do. They also provide a bunch of high-performance object libraries in a variety of fields. The Intel libraries and auto-vector/parallel code can also do processor sniffing to determine capabilities and use the most efficient routines for the particular processor that you are running on. Their compiler may also incorporate the knowledge of instruction latencies per processor architecture for further optimization.

I had a look at K and couldn’t find anything on them using Vector SIMD instructions for acceleration. They state that they use vector acceleration but it’s not clear to me what they are referring to. It’s pretty hard to get ahead of Intel in this area. They start putting in new processor support into their compilers well before the processors are released so that the compiler support is there before the processors are released.

Getting back to the main thread - yes, you don’t need to know the hardware and low-level stuff if you mainly use Higher-Level Languages. But clearly someone has to know it to build those Higher-Level Languages. Or the high-performance low-level languages and libraries.

BCEagle91 · June 21, 2012, 9:32am

You really have to have a passion for this stuff and live and breathe the stuff to get through a CS program.

You can see this in any field, not just CS, and it comes through from time to time on all sorts of threads here.

oldfort · June 21, 2012, 9:57am

Here is going to be my disclosure…I know K very well in my previous life, so I know the speed and how it works. I was also an APL programmer, and there is no comparison between APL and K. If anyoen do a search, they would know which WS firm used APL, A+ and how K came about. Some of those propriatary DB using K is more than just using columns, their DBs and analytics (MI) are infinitely scalable by adding more hardware.

thumper1 · June 21, 2012, 9:58am

I am NOT enjoying my Twizzlers. I think I’ll switch to a different thread.

oldfort · June 21, 2012, 10:07am

Save your twizzlers, unless other posters come back. BCEagle is not fun (neither am I, really).

BCEagle91 · June 21, 2012, 10:16am

Perhaps you should update the Wikipedia article on the K language.

Most commercial databases support a variety of table types. The list of features in traditional databases today is absolutely huge. The feature lists on traditional databases is huge.

Then why does Google use MapReduce? Why is Hadoop so popular now?

The AMD64 architecture (2003-2004) provided for cheap and large memory spaces and in-memory databases followed several years later. Kdb is an in-memory database. Comparing in-memory databases to traditional disk-oriented databases is an apples-to-oranges comparison. There are some inherent problems with in-memory databases too (like losing power). If you have petabytes of data, an in-memory database might set you back a few bucks too.

oldfort · June 21, 2012, 10:21am

I just like to hang out with those people, picking up few things here and there. I guess I skipped J.
[Passage:</a> K.E. Iverson](<a href=“http://keiapl.info/rhui/passage.htm]Passage:”>Passage: K.E. Iverson)

jym626 · June 21, 2012, 10:23am

Is this thread in English?

oldfort · June 21, 2012, 10:26am

Greek or Chinese, but I would be able to understand Chinese.

BCEagle91 · June 21, 2012, 10:34am

It’s a discussion between a user and an engineer and as such has a broad impedance mismatch.

dadx · June 21, 2012, 10:39am

With no matching network in sight. :)

At least it seems that the original question is being answered. All you have to do is read the discussion. And I stand by my original comment to the effect that the bar is set rather high.

Why so few Computer Science degrees granted?

CONNECT WITH US