My argument for this is that it keeps the rules simple, and is flexible but still constrained.
Votes? And if you vote No/-1, a short reason please.
-1 because I think that date! and time! values should be normalized in a data exchange format. Specifying such rules would make the spec significantly more complex, but not doing so would just shift the burden onto the implementers, which might result in different interpretations. For the duration use-case, I am for using a different format than time!, maybe resulting in a new datatype.
(I am taking the perspective of people implementing REN in non-Rebol languages)
How does the above proposal make it harder for non-Redbol languages?
For duration, are you thinking something like "1h2m3s"? If so, that kind of leads to a general units notation.
One of my thought experiments, as you may know, was to reduce the total number of datatypes Ren understands, to help facilitate adoption. Hence the implied-string type in the current spec, which covers the idea of any-string! to some extent. I considered the same thing with the segmented number type, which could be generalized to cover tuple!, time!, and pair!/point!. The issue there is that I didn't think it would be good to say [number! sep number! ...] and try to fit in exceptions.
And I wish there were easy answers. We know that we can loosen restrictions later, but can't make things more strict. I think we can safely assume that the harder we make it to write a conforming parser, the more likely it is that people will take shortcuts. e.g., we specify 00-59 for seconds but parsers use [2 digit=] as a rule, like RFC3339.
Where is that nice clear line between syntax and semantics? How do we allow people to use Ren in ways that may not be clear to us right now, while disallowing things that make no sense at all?
Also, complexity is mitigated if we required hour, minute, and int part of seconds to be fixed at 2 digits, if we go down the stricter path.
And do we favor an informative spec with the appearance of simplicity over a normative spec that leverages existing rules? e.g., do we say a time is the same whether standalone, part of a date, or as a date timezone, or does each have its own rules?
As I stated earlier, there could be more time systems possible, with a semi-serious hint to the Mars One expedition that REN should be able to handle. The thing is, in interfacing there is sometimes additional information that is not in the format itself. Take your example of seconds being 0-59 (or 00-59), with parsers shortcutting that check by just testing for 1 or 2 digits from 0 to 9. So the question is: does REN need to force all information into a certain specific format? Or should it be less strict? (I feel like I said the same as your last comment now, in other words)
Other popular langs (JS, Java, .NET, Go, Python, Ruby) distinguish between an instant in time (datetime) and duration (period/timespan/timedelta) and, as you might imagine, there is no consensus on how things work.
An instant in time is absolute, a duration is relative.
If you use DIFFERENCE on dates in REBOL you get a time! value. If you use SUBTRACT, you get a number of days. Given two dates, 1 day apart, you respectively get 24:00 and 1. Both are relative but the latter loses information. Fully qualified, if we didn't have separate date! and time! values, they could be viewed as 0000-00-00/24:00:00 or 0000-00-01/00:00:00.
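For comparison, Python's standard datetime module makes the same split between an absolute date and a relative duration. This is only a rough sketch: the two dates are arbitrary examples, and as_hms is a hypothetical helper that renders a duration without rolling hours over into days.

```python
from datetime import date, timedelta

def as_hms(delta: timedelta) -> str:
    """Render a non-negative duration as h:mm:ss, letting hours exceed 23."""
    total = int(delta.total_seconds())
    hours, rem = divmod(total, 3600)
    minutes, seconds = divmod(rem, 60)
    return f"{hours}:{minutes:02d}:{seconds:02d}"

d1 = date(2015, 5, 30)
d2 = date(2015, 5, 31)

delta = d2 - d1             # a relative value: timedelta(days=1)
days_only = delta.days      # day count, comparable to SUBTRACT
as_time = as_hms(delta)     # time-style view, comparable to DIFFERENCE
```

Both views describe the same one-day gap; the day count simply has no room for a sub-day part.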
REBOL doesn't allow literal dates with a month or day of 00. Dates are absolute, and standalone time values are relative. The timezone part of a date is also relative but limited to absolute Earth-day hour values (except that REBOL does a modulo 32:00 thing with it). Relative vs absolute in REBOL is implicit.
Do we need to explicitly distinguish between relative and absolute date-times in Ren?
If so, Doc is right that we should limit absolute time part values, and that we need a new notation for relative times (e.g., yyyy:mm:dd/hh:mm:ss.s, yyyy-mm-dd+hh:mm:ss.s, etc.), which still doesn't solve the Mars problem. Maybe John (Geomol) will weigh in with astronomical thoughts on that. :)
Let's simplify the problems to solve to Earth time only. Once men on Mars are a reality, we can try to push NASA to use Ren. ;-)
"How does the above proposal make it harder for non-Redbol languages?"
Because you need to make some assumptions about how to process the Ren data, the spec being too fuzzy. If you take this valid Ren time: 987:654:321, first, it fails to fit into the Ren-as-human-readable-format goal, then you need to store that into memory. Some Ren client developers might make the (a priori obvious) choice to use 8-bit slots to fit hour, minutes and seconds. In such case, they need to do some calculations to make it fit, which, in this example case, will result in an overflow for hours, and the spec does not say anything about what to do with that. Storing in memory each time component as 32-bit values is obviously a waste of space, so this is not satisfying. Moreover, some languages might already have some time supporting libraries which would not fit the choice made in Ren (because Ren would not stick to the standard human time format).
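To make the overflow concrete, here is a minimal sketch of the calculation a client would have to do for 987:654:321. The carry rule (fold everything into seconds, then split again) and the 8-bit check are assumptions for illustration; the spec does not define either.

```python
def normalize_time(hours, minutes, seconds):
    """Carry overflowing seconds and minutes upward; hours are left unbounded."""
    total = hours * 3600 + minutes * 60 + seconds
    h, rem = divmod(total, 3600)
    m, s = divmod(rem, 60)
    return h, m, s

def fits_in_u8(value):
    """True if the value fits in an unsigned 8-bit slot."""
    return 0 <= value <= 255

h, m, s = normalize_time(987, 654, 321)
hour_fits = fits_in_u8(h)   # minutes and seconds fit after carrying, hours do not
```

After carrying, minutes and seconds are back in range, but the hour count is still far too large for an 8-bit slot, and a parser has to decide on its own what to do about that.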
If we want Ren to be for humans, we should stick with formats humans can read and make sense of.
That said, limiting values to 2 digits + adding a mention in the spec that integers should be limited to respectively 24, 59, 59 would already be a big improvement IMHO.
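A rough sketch of what that stricter rule could look like in a parser: exactly two digits per segment, then a range check against 24, 59, 59. The function name, the optional seconds segment, and returning None on failure are all assumptions made for this example, not part of any Ren spec.

```python
import re

# Exactly two ASCII digits per segment; the seconds segment is optional here.
_TIME = re.compile(r"([0-9]{2}):([0-9]{2})(?::([0-9]{2}))?")

def parse_strict_time(text, max_hour=24):
    """Return (hours, minutes, seconds), or None if the text is not a valid time."""
    match = _TIME.fullmatch(text)
    if match is None:
        return None
    hours = int(match.group(1))
    minutes = int(match.group(2))
    seconds = int(match.group(3) or 0)
    # The syntax alone would accept 99:99:99, so the ranges are checked separately.
    if hours > max_hour or minutes > 59 or seconds > 59:
        return None
    return hours, minutes, seconds
```

The regex on its own is the "shortcut" a lazy parser would take; the range check is the part that people are likely to skip.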
"For duration, are you thinking something like "1h2m3s"? If so, that kind of leads to a general units notation."
Exactly yes, but even though I would be very glad to have a unit! datatype, it's too big a design job to open now. I would stick to an ad-hoc literal format for durations in the form of _c_y_h_m_s to begin with. Given that we use a lot of sub-second durations in computing, it could also include _ms_us_ns. But this topic is wider than Ren. I have been thinking about a duration! type for Red for a while, but haven't had time to design it yet. Also, as you rightly point out, I would also need to consider a more general unit! datatype.
In Ren goals, there is: "Be easy for humans to read and write". If you allow date/times to be written as implicit computations, like: 05-31-2015/100:32:06, it would be equivalent to allowing integers to be written as, e.g., 10 * 8 + 6 - 4 * 7 instead of 574. Ren is a data exchange format, it's not the Rebol console. :-)
I am not fond of having the string format cover any-string!. IMHO, it reduces the value of Ren compared to other formats. Removing the quotes for tags, email, url,... is one of the great things of the Rebol format, I think Ren should follow the same path. If you want to compete with JSON wrt the spec size and complexity, you risk ending up crippling Ren, and reducing its added value.
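As a sketch of how small such an ad-hoc duration literal could be to parse, the following handles only the h, m, s, ms, us, and ns suffixes and returns a nanosecond count. The c and y units are left out on purpose because their length is not defined in this discussion. The function name and the decision not to enforce unit order are assumptions.

```python
import re

# Nanoseconds per unit. Only fixed-length units are included.
_UNIT_NS = {
    "h": 3_600_000_000_000,
    "m": 60_000_000_000,
    "s": 1_000_000_000,
    "ms": 1_000_000,
    "us": 1_000,
    "ns": 1,
}

# Two-letter suffixes are listed first so "ms" is not read as "m" followed by "s".
_PART = re.compile(r"([0-9]+)(ms|us|ns|h|m|s)")

def parse_duration_ns(text):
    """Return the duration in nanoseconds, or None if the literal is malformed."""
    if not text:
        return None
    pos = 0
    total = 0
    while pos < len(text):
        match = _PART.match(text, pos)
        if match is None:
            return None
        total += int(match.group(1)) * _UNIT_NS[match.group(2)]
        pos = match.end()
    return total
```

A real design would still have to settle sign, fractions, ordering, and repeated units, which is where the general unit! question comes back in.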
Sorry for some bad grammar in my above posts, it was written before my morning coffee. :-)
"Storing in memory each time component as 32-bit values is obviously a waste of space, so this is not satisfying."
This, I would say, is not Ren's concern, and somewhat akin to saying ints should be 16 bits.
But I think we're largely on the same page otherwise. :-)
I started a Ren parser, and if we're OK saying that the time part of a date, and timezone as well, are limited, then all we have to fight about is standalone time values (relative time). Because if I can only say 23:59:59, maximum, not only do I lose meaning, but I'm limiting the relative range I can talk about. Strictly speaking, I can't even say 24:00, right?
This does add some complexity, though it's not bad as long as we require a fixed number of digits in each segment.
Note that my implied-string is separate from a quoted or braced string. No quotes. The one thing I don't cover yet is tags, which is tougher. It's a pending question whether they can be considered just another kind of string, and whether we can also present parens as just another list syntax along with square brackets. I was thinking of leaving them out of v1, having a page of proposals, and adding them later.
"This, I would say, is not Ren's concern, and somewhat akin to saying ints should be 16 bits."
Well, I understand your perspective, but you should realize that even R2 and R3 could not support Ren as-is if you stick to 32-bit time components.
A special case could be made for 24:00 value, though, I think that it is not worse it. If you want to represent just hours, you can use integers for that.
worse => worth
Timezones are another tricky area. I believe the range of timezones is -12:00 to +14:00. Most timezones are at either 0 or 30 minutes past the hour. However, there is Chatham Islands (New Zealand) at +12:45, Eucla (Australia) +8:45, Nepal +5:45. None at 15 minutes past the hour though.
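A minimal sketch of a timezone-offset check using the -12:00 to +14:00 range mentioned above, at one-minute resolution so the :45 offsets pass. The function name, the mandatory sign, and allowing a one-digit hour are assumptions for the example.

```python
import re

_TZ = re.compile(r"([+-])([0-9]{1,2}):([0-9]{2})")

MIN_OFFSET = -12 * 60   # -12:00 in minutes
MAX_OFFSET = 14 * 60    # +14:00 in minutes

def parse_tz_offset(text):
    """Return the offset in minutes, or None if it is malformed or out of range."""
    match = _TZ.fullmatch(text)
    if match is None:
        return None
    sign = -1 if match.group(1) == "-" else 1
    hours = int(match.group(2))
    minutes = int(match.group(3))
    if minutes > 59:
        return None
    total = sign * (hours * 60 + minutes)
    if not MIN_OFFSET <= total <= MAX_OFFSET:
        return None
    return total
```

Restricting the minutes to a fixed set such as 00, 30, 45 would be one extra membership test, which is why it is cheap to add now and relax later.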
There are always going to be limits in implementations, no doubt. But R2 loads 99999:99999999:999.999 just fine. R2 also does some crazy things with signed times and overflow, but that hasn't stopped it from being useful 99.9999999% of the time(s). ;-)
Also, R3 gives a different result than R2 for the above time value.
A compromise would be to set a limit that could be represented in most langs with reasonable efficiency. And look at it the other way. How many systems use epoch seconds, ms, or 100 ns as time units? We need to be able to represent those, correct? If not, you end up with integers, as you say, which provide no meaning. We effectively cripple time.
I don't worry that people will generate huge values from other systems, because those have their own limits as well, so the chances of someone generating a time like the above is virtually nil. My concern is security and opening doors for overflow attacks and such.
Again, I'm OK with limits for absolute time values, as in date-time. Timezones are an issue, which are a constrained relative time, and I'm good with limiting those as well. Right now my parse rules allow any hour 00-23, which we can address limits of separately if we want.
I thought there was something at a 0:15 offset, but it's easy to remove.
In order to allow large relative times to be represented, we either need to allow times to contain large segment values and/or extend the notion of time to include duration as well, with a new notation.
Do not limit timezone at 15 minutes. Timezones should have at least 1 minute resolution.
Even though they aren't recognized as valid? Noting that I agree with you :-), why so?
"Valid" meaning "used in the world today".
The reason I'm OK with the 0:15 limitation in v1 is that we can later relax the constraint without breaking data. In general, I want to let people express their data, even if it may not make sense to us right now. We say what a type of value looks like, but impose few semantic restrictions.
Timezones are crazy stuff. There were non-quarter-hour-based timezones in the past, and there's no reason not to expect them in the future.
TZ - Agreed, it has to be treated as filter/list that will change, and piss us all off. (Windows changed the defaults between Windows 7 and 8, names of cities also change, and so does the geographical location of the splits)
Go: A Duration represents the elapsed time between two instants as an int64 nanosecond count. The representation limits the largest representable duration to approximately 290 years.
ParseDuration parses a duration string. A duration string is a possibly signed sequence of decimal numbers, each with optional fraction and a unit suffix, such as "300ms", "-1.5h" or "2h45m". Valid time units are "ns", "us" (or "µs"), "ms", "s", "m", "h".
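The "approximately 290 years" figure follows directly from the int64 nanosecond count. A quick check, assuming a year of 365.25 days (that year length is my assumption, not something the Go docs state):

```python
MAX_INT64 = 2**63 - 1                       # largest signed 64-bit value
NS_PER_YEAR = 365.25 * 24 * 3600 * 10**9    # assumed year length in nanoseconds

max_years = MAX_INT64 / NS_PER_YEAR         # a little over 292 years
```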
func Date(year int, month Month, day, hour, min, sec, nsec int, loc *Location) Time
Date returns the Time corresponding to yyyy-mm-dd hh:mm:ss + nsec nanoseconds in the appropriate zone for that time in the given location. The month, day, hour, min, sec, and nsec values may be outside their usual ranges and will be normalized during the conversion. For example, October 32 converts to November 1.
ISO8601: For example, "P3Y6M4DT12H30M5S" represents a duration of "three years, six months, four days, twelve hours, thirty minutes, and five seconds".
To resolve ambiguity, "P1M" is a one-month duration and "PT1M" is a one-minute duration (note the time designator, T, that precedes the time value). The smallest value used may also have a decimal fraction, as in "P0.5Y" to indicate half a year. This decimal fraction may be specified with either a comma or a full stop, as in "P0,5Y" or "P0.5Y". The standard does not prohibit date and time values in a duration representation from exceeding their "carry over points" except as noted below. Thus, "PT36H" could be used as well as "P1DT12H" for representing the same duration.
Java: A Duration object is measured in seconds or nanoseconds and does not use date-based constructs such as years, months, and days, though the class provides methods that convert to days, hours, and minutes. A Duration can have a negative value
To define an amount of time with date-based values (years, months, days), use the Period class...The total period of time is represented by all three units together: months, days, and years.
Ruby: In versions prior to Ruby 1.9 and on many systems Time is represented as a 32-bit signed value describing the number of seconds since January 1, 1970 UTC, a thin wrapper around a POSIX-standard time_t value, and is bounded
Since Ruby 1.9.2, Time implementation uses a signed 63 bit integer, Bignum or Rational. The integer is a number of nanoseconds since the Epoch which can represent 1823-11-12 to 2116-02-20. When Bignum or Rational is used (before 1823, after 2116, under nanosecond), Time works slower as when integer is used.
Duration object is stored as seconds.
.NET: A TimeSpan object represents a time interval (duration of time or elapsed time) that is measured as a positive or negative number of days, hours, minutes, seconds, and fractions of a second. The TimeSpan structure can also be used to represent the time of day, but only if the time is unrelated to a particular date.
The largest unit of time that the TimeSpan structure uses to measure duration is a day. Time intervals are measured in days for consistency, because the number of days in larger units of time, such as months and years, varies.
The value of a TimeSpan object is the number of ticks that equal the represented time interval. A tick is equal to 100 nanoseconds, or one ten-millionth of a second. The value of a TimeSpan object can range from TimeSpan.MinValue (The string representation of this value is negative 10675199.02:48:05.4775808, or slightly more than negative 10,675,199 days.) to TimeSpan.MaxValue (The string representation of this value is positive 10675199.02:48:05.4775807, or slightly more than 10,675,199 days.).
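The quoted limits can be reproduced from the tick size alone. This sketch assumes only that the tick count is a signed 64-bit integer of 100 ns ticks; format_ticks is a hypothetical helper that mimics the d.hh:mm:ss.fffffff layout shown above, not a .NET API.

```python
TICKS_PER_SECOND = 10_000_000   # one tick is 100 nanoseconds

def format_ticks(ticks):
    """Format a tick count as [-]d.hh:mm:ss.fffffff."""
    sign = "-" if ticks < 0 else ""
    ticks = abs(ticks)
    seconds, frac = divmod(ticks, TICKS_PER_SECOND)
    minutes, s = divmod(seconds, 60)
    hours, m = divmod(minutes, 60)
    days, h = divmod(hours, 24)
    return f"{sign}{days}.{h:02d}:{m:02d}:{s:02d}.{frac:07d}"

max_span = format_ticks(2**63 - 1)   # largest signed 64-bit tick count
min_span = format_ticks(-(2**63))    # smallest signed 64-bit tick count
```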
Python: TimeDelta - Only days, seconds and microseconds are stored internally...Note that normalization of negative values may be surprising at first.
- The most negative timedelta object, timedelta(-999999999).
- The most positive timedelta object, timedelta(days=999999999, hours=23, minutes=59, seconds=59, microseconds=999999).
- The smallest possible difference between non-equal timedelta objects, timedelta(microseconds=1).
- seconds: Between 0 and 86399 inclusive
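The "surprising" normalization of negative values is easy to see with a tiny negative duration. The sketch below uses only the standard datetime module; the variable names are just for illustration.

```python
from datetime import timedelta

# One microsecond before zero. Only days may be negative after normalization,
# so the value is stored as minus one day plus almost a full day.
tiny_negative = timedelta(microseconds=-1)

fields = (tiny_negative.days, tiny_negative.seconds, tiny_negative.microseconds)

lowest = timedelta.min          # the most negative timedelta
highest = timedelta.max         # the most positive timedelta
step = timedelta.resolution     # the smallest difference between two timedeltas
```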
Just for fun:
- Planck time: 0:0:0.0000000000000000000000000000000000000000000054
- Age of universe: 0:0:432'000'000'000'000'000
I know this is a long discussion but, it's such an important element, I think it's worth it.
- I don't like the ISO8601/iCal/Go format for durations. e.g., "P3Y6M4DT12H30M5S"
- I DO want a dialect or lexical that lets us express units, but now is not the time, as Doc says.
- I think we should support large relative date-time values.
Also, I should note that I use Doc's scheduler library, and extended it a bit. The period/interval dialect there works well in that context.
Rebol's solution to relative time values is to use two datatypes: time!, allowing large non-sexagesimal values and normalizing them on load, and integer! which is the number of days between two dates. This maps reasonably well to the .NET and Python approaches, and makes sense if we want unambiguous offset values (because month and year lengths vary). That doesn't mean we can't support separate YMD values in a lexical format, just that we need to say what they mean to Ren.
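Because year lengths vary, a span of calendar years has no single day count. A small sketch with Python's datetime.date makes the ambiguity visible; the two four-year spans are arbitrary examples picked to include and exclude a leap day.

```python
from datetime import date

# 2012 is a leap year, so this four-year span contains one leap day.
span_with_leap_day = (date(2016, 1, 1) - date(2012, 1, 1)).days

# 2100 is divisible by 100 but not by 400, so it is not a leap year,
# and this four-year span contains no leap day at all.
span_without_leap_day = (date(2103, 1, 1) - date(2099, 1, 1)).days
```

Both spans are "4 years" as a YMD offset, yet they differ by a day, so a day count is only unambiguous once it is anchored to a start date.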
I shouldn't say non-sexa* values, as it does treat them as such. I mean overflow values.
"we need to say what they mean to Ren." By this, I mean is 4 years the same as 1460 days, 1461 days, or neither and the math is up to you?
Basically, constrain absolute values, don't allow timezone on relative values, and require a sign on relative date and date-time values to distinguish them from absolute. Relative time values do NOT require a sign, with the expectation that time-of-day would not be an independent value type.
Is this getting us closer or further away? It doesn't answer constraint questions, like a 4 digit limit on abs years, large values in relative parts, or minimizing overhead. Does it need to?
Does anyone object to this group being [web public]?
web-public: Fine with me.
The absolute / relative date/time proposal up on ren-data.org looks quite good. A few thoughts after first digesting it for a bit:
Standalone times as relative times: +1
Basic constraints for month and day value ranges in date and time-of-day/seconds: +1
Standalone relative dates don't strike me as something particularly worthy of providing, at first glance.
I also think that the forced sign looks rather awkward for negative rel-dates: -1-1-1, -10-0-1. Further, I think that date arithmetic with relative dates may be _really_ awkward.
So I'd suggest limiting any-date-time to just abs-date-time or abs-date or rel-time, for starters.
Thanks Andreas! Good feedback. I agree that negative rel-dates aren't pretty. When I wrote examples for my test parser, I used 4 digits for all years, which made them less ugly (-0000-01-00). We could easily require a minimum of 4 digits. Still not perfect, but perhaps better. I also agree that date math is tricky (all around), but Ren doesn't have to do the math. :-) For me, it's a question of "how else do we do it?" if relative dates are useful?
Tough call, as I think there *is* value in relative dates. Some of that value comes from them being imprecise, which conveys meaning as humans sometimes do.