Friday, August 19, 2005

language fetishists: it's not the semantics

java. ruby. python. c#. dylan. bah.

that last word isn't a language, but rather the expression of my dissatisfaction with the language fetishists of the world who go on constantly about how cool their languages are. or how about "common language runtimes". oooh, aaaah!

i've worked with a number of languages in my career and it's true that there are some really great powerful languages (e.g. c++) and some really fun and consistent ones that make development quick (e.g. ruby) and ones that just suck (e.g. lisp or fortran).

(how long until i get my first "but lisp is amazing, you insensitive clod!" email? to be honest, even if i overlook the obsessive use of parentheses, i just have a hard time thinking functionally the way lisp wants me to. it's doable, just odd.)

but it isn't the language at hand that has always been the biggest hurdle for me, it's been the way we still write code. it really hasn't changed much in the last, what, four decades? we open up text files in text editors and pound away at the keyboard. we've given writers tools like word processors, typesetting apps and even screenwriting programs that have freed them from pounding on typewriters, but have yet to really free ourselves from the prison we have invented for ourselves. and most of us are hardly aware of it, i think.

we happily continue to create header and implementation files, or come up with different ways of splitting up classes into manageable sizes (gotta love those 15k line files!). and let's not even get started on directory layouts.

but we must realize this sucks, because we've invested inordinate amounts of time building text editors that do funky syntax highlighting, block collapsing, bookmarking and more, and then throw those inside complex environments which manage projects and pull apart the code semantically to give us code completion and the ability to jump around by semantic unit, such as by class.

and when we want to re-use code? or refactor large bodies of it? well, we search through the text files! so we've built source code search engines and tools to autogenerate api documentation and on and on and on ...

and what about revisioning and unit testing? yep. yet more external tools.

at what point do we step back and go: "wtf?! we keep making flat files, but we don't really want flat files. instead we keep trying to make them into relational databases and semantic webs because that's actually how we work." apparently we haven't reached that point.

now, before you get the wrong idea: i'm not leading up to saying we should program visually. that's a load of crap suitable for only the most encapsulatable and simple of tasks. textual programming is fine. how we store and manage that text isn't.

so all you language fetishists and ide developers .... how about concentrating on the thing that really makes programming suck? build an environment that puts all the pieces of the code as i type it into a semantic web and shove it all into an indexed database somewhere.

revisioning, unit testing, build conformance, api documentation, code cross referencing (and thereby code completion), design recovery, searching, re-use and refactoring ... all of this would be so much easier, faster (as in fewer cpu cycles and fewer tools to set up and master) and more natural if the data were stored the way we use it.
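
as a rough illustration of what "stored the way we use it" could mean, here is a minimal python sketch. it is not taken from any existing tool: the schema (the `symbols` and `refs` tables) and the function names `open_db`, `index_source`, `complete` and `callers` are all made up for this example, and it only understands python source. the idea is simply that definitions and the calls between them go into an indexed database, so completion and cross referencing become queries instead of text searches.

```python
import ast
import sqlite3


def open_db(path=":memory:"):
    """create the (hypothetical) schema: one table of symbols, one of references."""
    db = sqlite3.connect(path)
    db.executescript(
        """
        CREATE TABLE symbols (module TEXT, name TEXT, kind TEXT, line INTEGER, doc TEXT);
        CREATE INDEX symbols_by_name ON symbols (name);
        CREATE TABLE refs (caller TEXT, callee TEXT);
        CREATE INDEX refs_by_callee ON refs (callee);
        """
    )
    return db


def index_source(db, module, source):
    """parse python source; record each function/class and the plain-name calls inside it."""
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, (ast.FunctionDef, ast.ClassDef)):
            continue
        kind = "class" if isinstance(node, ast.ClassDef) else "function"
        db.execute(
            "INSERT INTO symbols VALUES (?, ?, ?, ?, ?)",
            (module, node.name, kind, node.lineno, ast.get_docstring(node) or ""),
        )
        # only calls of the form name(...); method calls like obj.f() are ignored here
        for child in ast.walk(node):
            if isinstance(child, ast.Call) and isinstance(child.func, ast.Name):
                db.execute("INSERT INTO refs VALUES (?, ?)", (node.name, child.func.id))
    db.commit()


def complete(db, prefix):
    """code completion: every known symbol name that starts with prefix."""
    rows = db.execute(
        "SELECT DISTINCT name FROM symbols WHERE substr(name, 1, ?) = ? ORDER BY name",
        (len(prefix), prefix),
    )
    return [row[0] for row in rows]


def callers(db, name):
    """cross referencing: every definition that calls name."""
    rows = db.execute(
        "SELECT DISTINCT caller FROM refs WHERE callee = ? ORDER BY caller", (name,)
    )
    return [row[0] for row in rows]
```

after indexing a module that defines `area()` and a `report()` that calls it, `complete(db, "ar")` lists `area` and `callers(db, "area")` lists `report`. a real environment would need far more than this (types, scopes, revisions, other languages), but the stored doc strings already hint at how api documentation could fall out of the same tables.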

i have a dream, and in this dream there are no more flat files sitting on my hard disk full of code as if it were 1970 all over again. in this dream my unit tests are run when i commit code which happens when i save the file and mark it as "final" in my editor without me having to set up one darn thing. and in this dream it doesn't take a server to provide the horsepower needed for indexing and api documentation flows as a natural course of creating the code. and in this dream when i make a change to trunk/ porting it back to 3.4 and then porting it forward to 4.0 isn't a pain in the ass that takes up a bunch of my coding time.

</rant>

23 comments:

Roberto said...

LEO

James D said...

Hear hear.

James D said...

Whoops, meant to write a bit more than just that. Yes, I've been thinking about the same problems. Unfortunately I believe C++ is the very thing holding us back from your imagined programming utopia, because of the difficulty of parsing and extracting meaning from it. The complexity of the syntax; the way the preprocessor randomly munges text without regard for the syntax; the way things depend on the order they were defined; these things make it terribly hard to automatically do anything useful with C++ code in a reasonable amount of time (including compiling it!).

Languages like Java and C# don't have this problem. If you've never used C#, I would suggest trying it just to see how fast it compiles. Projects are built like lightning; I would say easily 10 times faster than C++. The simpler syntax and tighter rules make it possible to do much more than just compile, too: Intellisense is 100% up-to-the-second accurate, class diagrams update themselves automatically, and refactoring tools (only still in their infancy) can do amazing things with 100% correctness. *This* is the kind of language improvement we need, not fancy syntactic sugar for regexps or enforcement of a functional programming style.

Procedural OO programming is good enough for now; the next programming language improvements shouldn't be aimed directly at improving humans' ability to write code but at enabling automated tools to analyze, organize, and document the code to assist us.

Saem said...

Separate the storage format from the editing format? The vi and emacs losers wouldn't hear of it, because the storage format would likely be made super parser-friendly, like s-expressions or xml, while special tools handled presenting it in whatever more human-friendly format one might desire.

Saem said...

Also, I suggest checking out fowler's article on lambda the ultimate about the language workbench.

Vladimir Prus said...

"Semantic web" sounds great, but there are a lot of incremental improvements that can be made. Say, gcc tends to add many MB of debug info to each object file it creates, spending lots of time on that, and fails to do this accurately, so gdb knows nothing about half the classes. And for those it knows about, it can't do much, because C++ is a foreign language for gdb -- forget about overload resolution, you can't even say "std::cout << my_object" in gdb. So, add at least 100% to your debugging time.

Some workflow features -- like automatically running unit tests on save, and automatically launching the failed ones in a debugger -- can also be very handy.

As for parsing C++ -- the problem is not that it's hard, the problem is gcc folks haven't made a standalone parsing library.

As a side remark, there's a proposal to add modules to C++:
http://open-std.org/JTC1/SC22/WG21/docs/papers/2005/n1778.pdf

James D said...

Ah, I remember you had that argument about GCC with Roberto Raggi on kdevelop-devel. Well I won't repeat that here; people can refer to the discussion if they want to read about the difficulties of real-time C++ parsing.

That module proposal is very interesting; it sounds like a big step in the right direction for C++. Unfortunately I fear it will be a very long road through standardization and implementation before that technology gets to the point where it can be used in KDE.

Kleag said...

Hey, what you describe looks a lot like Smalltalk: no folders, no files, no extra syntax; only packages and classes in an environment that can do automatic refactoring and everything else you dream about.
And it's been here since the '70s...
I used it for some years and it was a fantastic programming experience! But all the tools were proprietary... I did not try the GNU version extensively.

Aaron J. Seigo said...

smalltalk: not even close.

c# being faster to compile: you've missed the entire point of this blog entry, haven't you =)

incremental improvements to things: i think the model we use is more broken than that.

vim/emacs people not wanting to use it: well, emacs would have a plugin available for the new format rather quickly, i bet. and seeing as how many people use IDEs these days, i somehow doubt it would be a show stopper =)

StyXman said...

why isn't smalltalk it? it's got everything you're ranting^H^H^H^H^H^H^H asking for!

also, it's simple to learn, because it's almost all library, very small syntax and no obscure corners like the ones *cough*C++*cough* has (think about what static means for local variables, global variables and functions).

PS: why-oh-why must I be a blogger/have a blogger account to post a comment?

segedunum said...

"or how about "common language runtimes". oooh, aaaah!"

Please don't mention CLRs. I've commented a fair bit in the past on why I think VMs, and especially Microsoft's idea of running everything in a stupid 'environment within an environment', are just utterly pointless. I can see the advantages of some sandboxing with Java and a JVM, but I do not see the advantages of trying to run everything in it.

Of course, Microsoft has a plan for their CLR and that is to make it the only way of writing and running applications on Windows for everyone outside Microsoft ;-). Why? Well it's to drive demand for three letters.

We need to seriously get people off Windows and on to another OS and desktop, soon. Certainly, if independent companies like Trolltech are to survive they must generate more revenue away from Windows development. Your work is more important than you know.

segedunum said...

"The simpler syntax and tighter rules make it possible to do much more than just compile, too: Intellisense is 100% up-to-the-second accurate"

You've eaten too many of those .Net and C# books for breakfast. None of that does what Aaron describes, and I know because I have to use it every day. It is still flat files, no matter how much utter crap you churn out every couple of years:

"now, before get the wrong idea: i'm not leading up to saying we should program visually. that's a load of crap suitable for only the most encapsulatable and simple of tasks. textual programming is fine. how we store and manage that text isn't."

"*This* is the kind of language improvement we need, not fancy syntactic sugar for regexps or enforcement of a functional programming style."

What a load of tosh. If you actually knew what the .Net framework was you would realise that it is just one massive wrapper around Win32! Oh, and what you're describing there has absolutely nothing to do with the language. You can quite easily do all that for C++.

On one course I got thrown on, the guy taking it drew a little rectangle and told us that was the size of the .Net framework today. Then he drew an even larger rectangle around it, the size of the board, and told us that was how far Microsoft was going. I caught up on quite a bit of sleep that week.

John said...

"build an environment that puts all the pieces of the code as i type it into a semantic web and shove it all into an indexed database somewhere."

Hmmm, sounds a bit like the language I use in my professional life, something called ObjectStar (http://www.objectstar.com), a 4GL RAD database development environment with its own database, which also speaks natively to just about every database on the planet. I use it on mainframe, but it also runs on Windows, UNIX and Linux servers. O* has the concept of a metastore: all the code, table definitions, screen layouts, everything, is stored in database tables and can be searched and selected from within your code. Heck, you can even change your code or table definitions on the fly with a database access statement.

The Rules language is semi-interpreted, the editor only allows 35 statements of code in a Rule (analogous to a module or object) and when you save the Rule it is tokenised and saved in a set of tables. From the view-point of the code, everything is a database table: 3270 screens, gui's, flat files, memory, everything is accessed with the SQL-like data-access language.

It's a fun system to work in, and fast to knock out new code, but it does take a performance hit from all the interpreted code and all that database access.

It's been around since the '80s, and is slowly dying off thanks to a lack of market share, but it proves that your idea is not so silly and is workable.

Cheers!

John

Alexander Kellett said...

tis smalltalk of which you speak aaron.

this answers nothing though. flat files ain't the problem. you ask for something that will not affect you at all, it'll only make the tools ever so slightly easier to write. not that this is a bad thing in itself, but none of these issues that you talk about will be automatically solved by making the storage semantic.

anyone can hack up a brute force algorithm. but coding it the Right Way, and getting it running fast in ruby utterly Sucks. for *many* problems you waste so much time attempting to improve the algorithm speed that you may as well have brute forced it in java (or c++, or whatever)

lypie

Alexander Kellett said...

oops. deleted a paragraph while attempting to figure out what the hell my blogger login was ;)

insert between the main two paragraphs:

the following is the real reason the language you use makes no difference.

and the summary at the end may as well read:

it all balances out at the end of the day. you save massively on token count by using ruby (see paul graham's texts on this). but you have to learn one of the other languages in any case to slap out that brute forced code.

end product? you end up smearing yourself across several skill sets and having to integrate multiple languages and domains, which we all know is teh suck.

(lypie, again)

James D said...

Au contraire, I think you are the one who missed *my* point. While I did mention that C# was faster to compile, that was not my *point*. My point was that the design of C# as a language with simpler syntax (among various other design decisions) makes it far more suitable than C++ for use in a next-generation IDE with super-intellisense, refactoring, and other goodies that rely on parsing and extracting meaning from code. Did you really miss it that badly, or were you just trying to be snarky?

James D said...

Oh and segedunum, I'm *not* saying that C# is not put in flat files, or that it is anything like what Aaron is describing here. Geez, you can't even mention C# around here without getting flamed. I'm *not* saying that we should all drop C++ tomorrow for C#, or that C# is perfect and flawless. *All* I am saying is that C# is easier to compile/parse/refactor than C++, and that is an example of the kind of innovation we need in the next new programming languages (as opposed to new syntax or novel flow control statements). Can you not see past your language bigotry long enough to understand the point of my post?

Anonymous said...

http://www.martinfowler.com/articles/languageWorkbench.html#ElementsOfALanguageWorkbench