
Submitted by Tomasz Wegrzanowski (Sun Aug 20 03:31:30 UTC 2006)
Currently the Ruby standard library uses C-style pseudo-symbols all over the place. For example, to create a socket one has to say:
socket = Socket.new(Socket::AF_INET, Socket::SOCK_STREAM, 0)
Instead of more Ruby-like:
socket = Socket.new(:inet, :stream, 0)
The current form is highly verbose: one needs to remember many multi-level namespaces (here Socket::SOCK_* and Socket::AF_*). It is also highly error-prone. It is very easy to make a mistake like:
socket = Socket.new(Socket::SOCK_STREAM, Socket::AF_INET, 0)
and it doesn't even raise an error; it just creates the wrong kind of socket, which is very hard to debug.
We can make Socket::SOCK_STREAM evaluate to Socket::SOCKEnum object (some Ruby libraries do something like that), but this would be ugly. It's much easier to simply use Symbols.
Second, make all existing pseudo-symbols evaluate to real symbols. So Socket::SOCK_STREAM will evaluate to :stream etc. This way most programs will continue to work.
Third, and optional, is to define Symbol#| to return some object that can later be converted to a number in the right way. Probably not an Array, because Array#| already does something, and :a|:b|:c would mean [:a,:b]|:c, which would be bad.
Maybe we should let such functions accept integers too, so people can pass nonstandard pseudosymbols. This would be very rarely used of course.
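A minimal sketch of how such a conversion could look, assuming one lookup table per argument position. The module SymbolicSocket, its tables, and the resolve and open helpers are hypothetical names chosen for illustration; they are not part of Ruby or of any implementation of this RCR. Integers pass through unchanged, as suggested above, and a symbol in the wrong position raises an error instead of silently creating the wrong kind of socket.

```ruby
require 'socket'

# Hypothetical wrapper; the symbol names follow the examples in this proposal.
module SymbolicSocket
  FAMILIES = { inet: Socket::AF_INET, inet6: Socket::AF_INET6 }.freeze
  TYPES    = { stream: Socket::SOCK_STREAM, dgram: Socket::SOCK_DGRAM }.freeze

  # Convert one argument using the table that belongs to its position.
  def self.resolve(value, table, slot)
    return value if value.is_a?(Integer) # nonstandard values pass through
    table.fetch(value) do
      raise ArgumentError, "unknown #{slot}: #{value.inspect}"
    end
  end

  # Assumed replacement for Socket.new that accepts symbols or integers.
  def self.open(family, type, protocol = 0)
    Socket.new(resolve(family, FAMILIES, "address family"),
               resolve(type, TYPES, "socket type"),
               protocol)
  end
end
```

With this sketch, SymbolicSocket.open(:inet, :stream) would build the same socket as the constant-based call, while SymbolicSocket.open(:stream, :inet) would raise ArgumentError because :stream is not a known address family.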
This proposal is mostly backwards-compatible, but there are a few cases where it is not.
Case 1 - a program actually relies on the fact that pseudosymbols are integers. The most common case is probably flags, like open("file", Fcntl::O_CREAT|Fcntl::O_EXCL). We can either make it backwards-compatible by defining Symbol#| or accept the incompatibility. I'm not sure whether Symbol#| is a good idea or not. If we added Symbol#|, backwards compatibility would be pretty much complete in this case.
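One way Symbol#| could behave is sketched below, under stated assumptions. The SymbolFlags class, its to_flags method, and the OPEN_FLAGS table are hypothetical; the table uses Ruby's built-in File::Constants rather than Fcntl so that the sketch needs no extra library. The idea is that the combined object only remembers names, and the receiving function converts them with its own table.

```ruby
# Hypothetical flag-set object; not part of Ruby or of this RCR's implementation.
class SymbolFlags
  attr_reader :names

  def initialize(*names)
    @names = names.flatten.uniq
  end

  # Chaining: (:a | :b) | :c keeps accumulating names.
  def |(other)
    SymbolFlags.new(@names, other.is_a?(SymbolFlags) ? other.names : other)
  end

  # Convert to an integer using the table supplied by the receiving function.
  def to_flags(table)
    @names.inject(0) { |bits, name| bits | table.fetch(name) }
  end
end

class Symbol
  # Assumed definition of the optional Symbol#| discussed above.
  def |(other)
    SymbolFlags.new(self) | other
  end
end

# Assumed table for open-style flags, built from File::Constants.
OPEN_FLAGS = {
  creat: File::CREAT, excl: File::EXCL,
  wronly: File::WRONLY, rdonly: File::RDONLY
}.freeze

flags = :creat | :excl | :wronly
mode  = flags.to_flags(OPEN_FLAGS) # same integer as File::CREAT | File::EXCL | File::WRONLY
```

Returning a dedicated object rather than an Array sidesteps the Array#| problem mentioned earlier, at the cost of adding a method to Symbol.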
Case 2 - we don't really know what namespace to use. Most calls, like Socket.new, know how to convert each of their arguments. Some, however, don't, like ioctl(2), and they'd need more complex solutions.
It should not be slower in most cases.
Both:
socket = Socket.new(Socket::AF_INET, Socket::SOCK_STREAM, 0)
and
socket = Socket.new(:inet, :stream, 0)
have to convert symbols to numbers at run-time. The only difference is that one does it before the call and the other after it. Ruby might do some sort of optimization here, but both cases seem to be equally optimizable.
Because of the slight backwards incompatibility, it's best to do the switch when moving to Ruby 2.



RCRchive copyright © David Alan Black, 2003-2005.
I forgot my signature again. Ondrej Bilka
I have an impression that it should be fairly simple to code it. If most of us agree that this is a good idea, I can try doing some implementation. But it's better to ask first and code later, especially if the coding has to be done in C :-)
The number of RCRs aimed at this problem is really huge, so the feeling that something ought to be done about it seems quite prevalent. I think this RCR is the most backward-compatible and most elegant so far, but well - I'm certainly biased here, so I'd like to hear your views too :-)
Translating C code to
is pretty straightforward, whereas
is not. One might be unsure whether to choose :SOCK_STREAM or :stream or :STREAM or something totally different. I am not rejecting this idea, but there must be some specific rule for converting C code to Ruby symbolic code.
matz.
As far as I can tell, C symbols in almost all libraries are all-uppercase, and even when they use mixed case, it is almost unheard of for different cases to mean different things. Grepping /usr/include on an Ubuntu system (33520 different symbols that are #defined as numbers), I've only found one such case, in X11/keysymdef.h, where Unicode characters are #define'd and case matters: etc.
So we can simply select any rules for character case. I propose :all_lower_case_with_underscores as this would be most consistent with the rest of Ruby.
So the only problem left is whether to use :stream or :sock_stream, as about 67% of pseudosymbols have two or more underscores. I think that most of the time it should be obvious what "the right thing" to do is (usually cutting the first part, or the first two parts if they always appear together, as when XML_SCHEMAS_ELEM_DEFAULT becomes :elem_default). Even in cases where it's not, it's often better than the current situation. If the RCR gets accepted, the choice will be between :o_rdonly and :rdonly. Now one must guess whether it's Socket::O_RDONLY or IO::O_RDONLY or File::O_RDONLY or Fcntl::O_RDONLY or something else.
There is one more problem - some C symbols have a number just after an underscore. As many as 7.5% have a number following some underscore, but most of them are safe because the number is somewhere further along in the symbol, like GL_DOT_PRODUCT_TEXTURE_1D_NV. However, 1.4% have a number following the first underscore, and in Ruby they'd need to be translated to quoted symbols such as :"3d_color_texture". This isn't perfect, but we can probably live with it.
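The naming rule discussed here (drop the leading part, use lower case with underscores) can be sketched as follows. The helper c_constant_to_symbol is a hypothetical name, and passing the prefix explicitly is an assumption; deciding which prefix to cut for each library is exactly the open question above.

```ruby
# Hypothetical helper: derive a Ruby symbol from a C constant name by
# dropping a known prefix and lowercasing the rest.
def c_constant_to_symbol(name, prefix)
  unless name.start_with?(prefix)
    raise ArgumentError, "#{name} does not start with #{prefix}"
  end
  name.delete_prefix(prefix).downcase.to_sym
end

c_constant_to_symbol("SOCK_STREAM", "SOCK_")                      # => :stream
c_constant_to_symbol("XML_SCHEMAS_ELEM_DEFAULT", "XML_SCHEMAS_")  # => :elem_default
c_constant_to_symbol("GL_3D_COLOR_TEXTURE", "GL_")                # => :"3d_color_texture"
```

The last call shows the digit case: the result is still a valid Symbol, but it has to be written with quotes in source code.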
-- Tomasz Wegrzanowski
[You wrote]"I have an impression that it should be fairly simple to code it. If most of us agree that this is a good idea, I can try doing some implementation. But it's better to ask first and code later, especially if the coding has to be done in C :-)"
Actually the implementation is part of the RCR process. That makes it easier for people to try it out and decide what they think of it. Sometimes an implementation can't be written (as when someone suggests a fundamental syntax change), but in general the approach should be: ask and code at the same time, not ask first and code later.
I submitted my implementation at [ruby-core:08850]. What still needs to be written is an enum class that holds symbols and allows negation, and perhaps a way to make the existing CONSTANTS into constants of that class. Ondrej Bilka
I can't think of a place where it would make sense to do different things depending on whether you receive a :socket_o_rdonly, an :io_o_rdonly, or a :file_o_rdonly. You could just have :o_rdonly and let the function decide. That's the beauty of symbols: context is everything. It even gives a more "what you expect" feel.
my 2cents