2007-09-06:
[9:07] <rjb> Lonnie: hi & thx for the message
[9:08] <Lonnie> Yeah, thanks for the help.
[9:08] <rjb> i'll ping you on yahoo IM in a moment
[9:08] <Lonnie> ok
[10:19] <Lonnie> Did you ever ping me?
[10:19] <rjb> yeah
[10:20] <rjb> you're shown as unavailable on yahoo IM though
[10:50] <rjb> http://helma.pastebin.com/m7b669f75
[10:51] <rjb> it kind of matter (for e4x) whether or not you use xbean.jar, even with the most recent rhino release
[10:52] <rjb> s/matter/matters/
[10:54] <zumbrunn> rjb, good to know
[10:55] <zumbrunn> I saw you mentioning it earlier
[10:55] <zumbrunn> (but didn't read up on what else I missed on the channel yet, btw)
[10:55] <rjb> one note: rhino seems to load xbean.jar when it is present in the same dir as rhino.jar
[10:56] <rjb> even if no CLASSPATH is set
[10:57] <zumbrunn> you mean if you work with rhino independently from helma?
[10:58] <rjb> right, i'm working with a command-line rhino
[10:58] <zumbrunn> ok
[10:58] <rjb> another behavior that seems counterintuitive to me:
[10:58] <rjb> js> var x = <x/>;
[10:59] <rjb> js> x.toString(); // empty string returned
[10:59] <rjb> js> x.appendChild(<y/>);
[10:59] <rjb> js> x.toString();
[11:00] <rjb> <x>
[11:00] <rjb> <y/>
[11:00] <rjb> </x>
[11:00] <zumbrunn> more important is what x.toXmlString() returns, though
[11:00] <zumbrunn> but I agree
[11:00] <rjb> but maybe that agrees with the spec, i'm not sure
[11:01] <rjb> yeah that method works as expected (it's toXMLString() btw)
[11:02] <zumbrunn> right, that one ;-) ...remembered it wrong
[11:02] <rjb> also, one should be aware that typeof(x) returns 'xml'
[11:03] <rjb> and not (as someone might expect) 'object'
[11:07] <rjb> as for using e4x to build xhtml output, i see 2 issues
[11:08] <rjb> 1. you can't use standard html entity references (&nbsp; and friends)
[11:09] <rjb> 2. any empty element is output as a minimized tag (<br/> etc.)
[11:10] <rjb> certainly, both have workarounds
[11:11] <rjb> (the second one is harder, if you insist you want <br /> with a space, and so on
[11:11] <rjb> )
[11:13] <Lonnie> Does Helma have an object model similar to the DOM on the server-side, for building markup?
[11:17] <rjb> not yet, that i know of
[11:18] <rjb> it provides a template language, and hooks for integrating others, in case you don't like its own
[11:20] <rjb> (as for entity references, i find it disappointing that e4x chooses to 'deal' with that nasty part of the xml spec by ignoring it exists)
[11:20] <rjb> this does mean that a lot of perfectly valid xml that is out there will fail to parse
[11:22] <Lonnie> I just printed out all of the Helma documentation. Hopefully, I'll soon understand all the Helma terminology and concepts. I sure wish the documentation was in a pdf with an index.
[11:23] <rjb> Lonnie: e4x is precisely a (highly simplified) object model for dealing with xml in ecmascript
[11:24] <rjb> it's not a helma thing, and it's not directly used by helma at this time
[11:24] <Lonnie> Thanks for point that out; I couldn't quite decipher what you guys were talking about.
[11:34] <midnightmonster> rjb, any luck with the entities using xbean? (I would guess not. By XML spec only validating parsers are supposed to understand entities. They're not supposed to work in xhtml-served-as-xml either. Sometimes the specs are stupid.)
[11:34] <midnightmonster> i use literals most of the time and numeric entities when I rarely need them.
[11:36] <midnightmonster> but I made the switch to utf-8 and using entities almost never before I got to Helma
[11:39] <midnightmonster> fwiw, I had to set sourceCharset = UTF-8 in app.properties, and you might find file.encoding = UTF-8 useful as well
[11:41] <midnightmonster> (re: my first message, hadn't read the log completely)
[11:42] <midnightmonster> about the two XHTML issues, 1) use utf-8 & literals & numeric entity references--they work fine in e4x and all browsers
[11:44] <midnightmonster> 1 cont.) failing on named entities is 100% correct according to the XML spec. only validating parsers--which are not the usual case and which I, for one, don't even want--are "allowed" to use them.
[11:46] <midnightmonster> 2) Despite the w3c compatibility guidelines, there aren't any important browsers that have a problem with <br/> without the space. I think the last one may have been ns4 if that
[11:47] <rjb> i'm ok with what you say on 1. That is a fine workaround.
[11:48] <midnightmonster> 3) (a new problem!) but minimizing empty elements is problematic for other reasons. in particular, minimizing <script> and <textarea> totally break HTML browsers
[11:48] <midnightmonster> my best workaround so far: <script src="whatever">/* */</script>
[11:48] <rjb> (3) == (2) in fact, i had those in the back of my mind as well
[11:50] <midnightmonster> <textarea>.</textarea> and xml.toXmlString().replace(/>\.<\/textarea>/g,"></textarea>")
[11:50] <midnightmonster> which is not foolproof, but should be correct enough for how I would use it.
[11:50] <rjb> kludgy
[11:51] <midnightmonster> indeed. a toHtmlString() method would be highly desireable
[11:53] <midnightmonster> 4) merely an inconvenience, but for xhtml to validate correctly, you need the right xmlns attribute on the <html>. That means another string replace or using the extra step of having a namespace all the time when working with it
[11:53] <rjb> btw i'm not quite sure whether a nonvalidating parser is supposed to fail on named entities, or rather to report them without resolving
[11:54] <rjb> it's been a couple of years since i read the specs
[11:54] <midnightmonster> XML fails hard by spec. essentially no errors are recoverable.
[11:54] <midnightmonster> again, sometimes the specs are stupid
[11:54] <midnightmonster> My choice so far has actually been to perform a couple other transforms and output my e4x as html 4.
[11:55] <midnightmonster> but toHtmlString and/or toXhtmlString would be really handy. and not hard, I think
[11:55] <rjb> i believe a nonvalidating StaX parser reports unresolved entity references and continues, but i might be wrong
[11:57] <rjb> that part of the spec is really muddy, by my memory
[11:58] <zumbrunn> xml.toXmlString().replace(/<textarea\/>/g,"<textarea></textarea>") wouldn't work?
[11:58] <zumbrunn> or that approach with a more complex regex that wouldn't choke on textarea attributes, anyway
[11:59] <midnightmonster> the more complex regex is the trick
[11:59] <zumbrunn> yep ;-)
[12:00] <Lonnie> Wouldn't it be more efficient to just use a java library for building your markup. For each request a java dom object could be created and it would have methods for adding nodes/tags to the markup hierarchy. This would reduce runtime compiling wouldn't it?
[12:01] <midnightmonster> <textarea makeYourLifeDifficult="angle brackets in > attributes"/>
[12:01] <rjb> and remember there's CDATA sections, though hardly anyone uses them
[12:01] <midnightmonster> e4x is actually pdq. and that sounds like hell, lonnie
[12:03] <rjb> maybe filtering output through jTidy could solve some of those troubles
[12:03] <midnightmonster> CDATA can't go in attributes, can it?
[12:03] <rjb> (and likely discover other problems we haven't noticed)
[12:03] <midnightmonster> or are we concerned about <textarea> "tags" in cdata?
[12:03] <rjb> the latter
[12:04] <rjb> like, a transcript of our discussion put inside a CDATA ;-)
[12:05] <midnightmonster> Lonnie, e4x is "better" and really different than any other XML interface out there, especially DOM, in that it makes XML elements a native type. You can actually use XML syntax smack in the middle of your code.
[12:05] <midnightmonster> it's the power and convenience and the way it does some things exactly right all the time that make the limitations a bit maddening
[12:05] <rjb> it does use DOM internally in fact (or am i wrong?)
[12:06] <midnightmonster> the java impl does
[12:06] <rjb> it just doesn't expose the complete api
[12:07] <midnightmonster> i think the c impl does not. or rather, not so similar a DOM as the one JAvaScript exposes that it would be trivial to implement the corssover (b/c they don't)
[12:24] <rjb> as for <textarea>, i suppose i'd just stick a single space character inside the element and not worry about it any longer
[12:24] <rjb> nobody would notice anyway ;-)
[12:28] <rjb> hmm i might actually d/l jtidy.jar and script some tests of e4x output and its transformation
[12:30] <midnightmonster> the space gets eaten. doesn't work
[12:31] <midnightmonster> "JTidy is a Java port of HTML Tidy" <-- I have little hope, then. Apprently HTML Tidy was once useful. It's been a disaster for every application I've seen it used.
[12:34] <rjb> huh? you're right about the space, but i believe that is wrong
[12:37] <rjb> it's not ignorable whitespace according to the spec
[12:41] <midnightmonster> unfortunately, I think it is
[12:41] <midnightmonster> bbl
[13:01] <rjb> "An XML processor must always pass all characters in a document that are not markup through to the application."
[13:07] <rjb> "Markup takes the form of (...) any white space that is at the top level of the document entity (that is, outside the document element and not inside any other markup)."
[13:13] <rjb> (i.e. whitespace inside elements must be passed == is not ignorable)
[14:06] <rjb> what makes Tidy a disaster?
[14:07] <a2> yes
[14:20] <rjb> looks not too bad:
[14:21] <rjb> js> importPackage(Packages.org.w3c.tidy);
[14:22] <rjb> js> t.XHTML=true;
[14:22] <rjb> js> var s = '<html>HTML</html>';
[14:23] <rjb> js> var instr = new java.io.ByteArrayInputStream(values(s));
[14:23] <rjb> js> t.parse(instr, java.lang.System.out);
[14:24] <rjb> --> this outputs a very nice valid XHTML 1.0 Transitional document
[14:26] <rjb> DOCTYPE, xmlns and all
[14:30] <rjb> (sorry i forgot to mention values() is my function to convert a string into an array of numbers)
[14:30] <rjb> (i.e. char codes)
[14:31] <rjb> and i forgot, var t = new Tidy();
[14:32] <rjb> ok i guess i should be using the pastebin, but there's not much chatter on this channel anyway
[14:35] <zumbrunn> you didn't paste code, you pasted command lines ;-)
[14:38] <zumbrunn> rjb, does tidy happen to "remaximize" the script and textarea tags minimized by e4x?
[14:40] <rjb> checking now
[14:41] <rjb> uhh not quite
[14:42] <rjb> <textarea/> goes to <textarea />
[14:43] <rjb> and <script src="foo"/> to <script type="text/javascript" src="foo" />
[14:43] <rjb> not really good
[14:43] <zumbrunn> maybe that could be considered a bug in tidy
[14:44] <rjb> Tidy does have a bunch of config variables though that i haven't dug into yet
[14:44] <rjb> so it's too early to say
[15:00] <midnightmonster> this is one example of why Tidy seems to me to be one large bug to me. "corrects" things that aren't errors (like the script type), fails to correct almost anything you care about.
[15:01] <midnightmonster> I hope there is a correct-er config though that will give the results you want
[15:08] <midnightmonster> rjb, the XML processor has to make all the whitespace available to the application, but it doesn't say what the application has to do with it afterward. e4x *does* make all whitespace available to the application. E.g., (<b> </b>).text()==" "'
[15:08] <midnightmonster> but it doesn
[15:08] <midnightmonster> 't preserve some of it when rendering, which depending on your application is a bug for the application, but not an XML spec violation
[15:16] <midnightmonster> anyway, whether e4x is a processor or an application or both isn't 100% obvious to me, but I think the actual behavior of rhino's e4x wrt collapsing whitespace is similar to the e4x spec, at least as far as the current problem
[15:18] <rjb> i'm sorry but that's not what i'm seeing
[15:19] <rjb> (<p> </p>).text().toXMLString()==" "; --> false
[15:19] <rjb> (<p> </p>).text().toXMLString().length; --> 0
[15:19] <midnightmonster> b/c you're using toXMLString
[15:20] <midnightmonster> which goes through e4x's XML serializing algorithm
[15:20] <rjb> same with toString()
[15:20] <midnightmonster> but (<p> </p>).text().toString()==" "
[15:20] <midnightmonster> oh really? hmmm
[15:20] <rjb> (<p> </p>).text().toString().length; -->0
[15:21] <midnightmonster> I get 1
[15:21] <midnightmonster> in spidermonkey and rhino
[15:22] <rjb> mine is rhino1.6R7 with xbean.jar
[15:23] <rjb> and my point of view would be that the XML object is the processor, and the script that calls into it is the application
[15:23] <midnightmonster> mine is whatever rhino is now in helma cvs without xbean.
[15:23] <midnightmonster> yes, but toXMLString() is an application as well
[15:24] <midnightmonster> b/c it doesn't provide access to the structure, it does something with it
[15:24] <rjb> w/o xbean some behavior changes, like DOCTYPE throwing an exception
[15:24] <rjb> which is hardly desirable btw
[15:24] <midnightmonster> yeah. I now about doctype
[15:24] <midnightmonster> know
[15:26] <rjb> (new XML(s)).toXMLString() should round-trip
[15:26] <rjb> assuming that s is well-formed
[15:27] <rjb> except perhaps for ignorable whitespace
[15:47] <midnightmonster> here's something wacky: since the update, my rhino/e4x actually is preserving whitespace.
[15:47] <midnightmonster> in toXMLString()
[15:48] <midnightmonster> (without xbean)
[15:48] <midnightmonster> (<p> </p>).toXMLString()=="<p> </p>"
[15:48] <zumbrunn> cool
[15:50] <rjb> update, to?
[15:52] <midnightmonster> the one in helma cvs
[15:52] <midnightmonster> which should be 1.6r7, iiuc
[15:54] <midnightmonster> I just add the doctype on at the end (a *much* cheaper operation than .replace, let alone running tidy)
[15:55] <midnightmonster> (the end being chronologically, not end of the string)
[15:55] <rjb> that is not what i'm seeing
[15:55] <midnightmonster> but you're using xbea, right?
[15:55] <rjb> (<p> </p>).toXMLString(); --> <p/>
[15:56] <rjb> both with and without xbean
[15:56] <rjb> weird
[15:56] <midnightmonster> (or another way): (new XML(myp = "<p> </p>")).toXMLString()==myp
[15:57] <midnightmonster> spidermonkey does *not* do this
[15:58] <rjb> ok gotta run; later
[21:10] <rjb> well i'm afraid jtidy doesn't seem to implement C.3. of http://www.w3.org/TR/xhtml1/#guidelines
[21:11] <rjb> (that concerns the <textarea/> and <script/>)
[22:06] <rjb> interesting: konqueror has no problem with <textarea />
[22:07] <rjb> displays it just fine
[22:09] <rjb> the WDG validator says it's okay
[22:09] <rjb> firefox gets confused though
[22:35] <rjb> https://bugzilla.mozilla.org/show_bug.cgi?id=171462
[22:43] <rjb> ok, <textarea/> will render correctly in firefox IF it understands the file is xhtml+xml
[22:44] <rjb> that will happen when the file is served with that content type
[22:44] <rjb> or, if a local file, if its name is *.xhtml
[22:45] <rjb> an xhtml DOCTYPE isn't enough if content-type header says text/html
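
Note on the "more complex regex" raised between 11:58 and 12:01: a rough sketch of what it might look like, working on the string returned by toXMLString(). The function name expandMinimizedTags and the tag list are illustrative assumptions, not part of Helma, Rhino, or e4x. It tolerates ">" inside quoted attribute values, but as pointed out at 12:03 it would still rewrite look-alike text inside a CDATA section.

```javascript
// Hypothetical helper (not a Helma/Rhino API): re-expand minimized tags
// that text/html browsers mishandle. The tag list is an assumption.
var NON_MINIMIZABLE = ["script", "textarea"];

function expandMinimizedTags(markup) {
    // One attribute is name="value" or name='value'; a quoted value may
    // legally contain ">" so it is matched as a unit.
    var attrs = "((?:\\s+[^\\s\"'<>\\/=]+\\s*=\\s*(?:\"[^\"]*\"|'[^']*'))*)";
    var pattern = new RegExp(
        "<(" + NON_MINIMIZABLE.join("|") + ")" + attrs + "\\s*\\/>", "g");
    // $1 is the tag name, $2 the attribute run (possibly empty).
    return markup.replace(pattern, "<$1$2></$1>");
}
```

In a Rhino session this would be applied as expandMinimizedTags(x.toXMLString()); other minimized elements such as <br/> are left alone.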