HP Forums
Embedded code blocks and Firefox non-breaking spaces - Printable Version



Embedded code blocks and Firefox non-breaking spaces - DavidM - 11-26-2022 01:46 AM

Just noticed something strange with embedded code blocks when using Firefox as the browser:

If you copy the text in a code block by selecting it and performing "copy", the spaces in the source seem to be getting converted to non-breaking spaces when you paste it into an application. This wreaks havoc if you are trying to paste the source into something like EMU48.

As far as I can tell, this does not happen with Chrome or Edge, and I haven't tried other browsers yet to see what happens. I've seen a couple posts in other places about this being a result of a recently "fixed" bug in Firefox, so I'm reasonably certain it is the culprit.

Just curious as to whether this is a known issue, and if there are any known workarounds (other than "don't use Firefox"). :)


RE: Embedded code blocks and Firefox non-breaking spaces - rprosperi - 11-26-2022 03:05 AM

As it appears to be a bug in FF, there is likely no easy workaround. About all one can do is dump the copied text into an editor, change the non-breaking spaces back to regular spaces, then copy and paste again. Annoying to be sure, but in my experience the Mozilla folks tend to fix most reported bugs pretty quickly, and buggy bug-fixes typically even faster.
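
The editor round-trip described above can also be scripted. Here is a minimal sketch in Python, assuming the pasted text is already available as a string; the function name nbsp_to_space is made up for this example and is not part of any tool mentioned in this thread.

```python
NBSP = "\u00a0"   # NO-BREAK SPACE (A0h)
SPACE = "\u0020"  # ordinary SPACE (20h)


def nbsp_to_space(text: str) -> str:
    """Return a copy of text with every NO-BREAK SPACE replaced by SPACE."""
    return text.replace(NBSP, SPACE)


if __name__ == "__main__":
    # Hypothetical sample: a short line whose separators were copied as NBSP.
    pasted = "DUP\u00a0*\u00a0SWAP"
    print(nbsp_to_space(pasted))
```

Note that this replaces every non-breaking space, including any that were intended as such.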


RE: Embedded code blocks and Firefox non-breaking spaces - SammysHP - 11-26-2022 08:52 AM

I found this article that summarizes the problem: https://utcc.utoronto.ca/~cks/space/blog/web/FirefoxNonbreakingSpacesCopyIssue


RE: Embedded code blocks and Firefox non-breaking spaces - Didier Lachieze - 11-26-2022 09:07 AM

(11-26-2022 01:46 AM)DavidM Wrote:  Just curious as to whether this is a known issue, and if there are any known workarounds (other than "don't use Firefox"). :)

Does this also happen when copying from the Printable Version of the page?


RE: Embedded code blocks and Firefox non-breaking spaces - Thomas Klemm - 11-26-2022 10:52 AM

(11-26-2022 01:46 AM)DavidM Wrote:  If you copy the text in a code block by selecting it and performing "copy", the spaces in the source seem to be getting converted to non-breaking spaces when you paste it into an application.

The non-breaking space characters are already in the HTML source.
I put 10 space characters of each kind between two | characters, both in a code block and in text set in a monospaced font:

Code:
SPACE: U+0020
|          |

NO-BREAK SPACE: U+00A0
|          |

EN SPACE: U+2002
|          |

EM SPACE: U+2003
|          |

font-family: Courier

SPACE: U+0020
| |

NO-BREAK SPACE: U+00A0
| |

EN SPACE: U+2002
|          |

EM SPACE: U+2003
|          |

To me it boils down to:
  • only in code blocks SPACE and NO-BREAK SPACE are replaced by &nbsp;
  • in ordinary text they are left as is
  • EN SPACE and EM SPACE are always left as is
Browsers squash multiple SPACE or NO-BREAK SPACE characters to a single one.
Thus these characters cannot be used to format code in ordinary text.
However, both EN SPACE and EM SPACE can be used.
But their width is the same as that of an ordinary SPACE,
at least in a monospaced font like Courier.

I recommend viewing the page source of this post.
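
Since these characters look identical on screen, it can help to list the code points directly. Here is a minimal sketch in Python using the standard unicodedata module; the function name describe_spaces is hypothetical.

```python
import unicodedata


def describe_spaces(text: str) -> list[str]:
    """List every space-separator character in text as 'U+XXXX NAME'."""
    found = []
    for ch in text:
        # "Zs" is the Unicode general category for space separators; it
        # covers SPACE, NO-BREAK SPACE, EN SPACE and EM SPACE, among others.
        if unicodedata.category(ch) == "Zs":
            found.append(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
    return found


if __name__ == "__main__":
    sample = "|\u0020|\u00a0|\u2002|\u2003|"
    for line in describe_spaces(sample):
        print(line)
```

Running it on text copied from a code block shows at a glance whether the clipboard holds U+0020 or U+00A0.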

(11-26-2022 09:07 AM)Didier Lachieze Wrote:  Does this also happen when copying from the Printable Version of the page?
Yes.


RE: Embedded code blocks and Firefox non-breaking spaces - DavidM - 11-26-2022 01:47 PM

(11-26-2022 10:52 AM)Thomas Klemm Wrote:  
font-family: Courier

SPACE: U+0020
| |

NO-BREAK SPACE: U+00A0
| |

NO-BREAK SPACE: U+00A0 (converted to UTF-8 in Notepad++)
|          |

EN SPACE: U+2002
|          |

EM SPACE: U+2003
|          |


I added one item to your list above. It actually renders as a standard-width space when a mono-spaced font is used, so it can be used to keep indentation aligned (unlike the en- and em-space). Here's the same text, non-quoted:

font-family: Courier

SPACE: U+0020
| |

NO-BREAK SPACE: U+00A0
| |

NO-BREAK SPACE: U+00A0 (converted to UTF-8 in Notepad++)
|          |

EN SPACE: U+2002
|          |

EM SPACE: U+2003
|          |


I wonder if this issue would go away if we could somehow place the source code in an HTML <pre>...</pre> block. I don't believe MyBB supports that, though.


RE: Embedded code blocks and Firefox non-breaking spaces - Thomas Klemm - 11-26-2022 02:24 PM

It appears that I messed up the NO-BREAK SPACE part and just inserted SPACE instead.
Sorry for the confusion.
Also, I've created another post: NO-BREAK SPACE: U+00A0.


RE: Embedded code blocks and Firefox non-breaking spaces - Thomas Klemm - 11-26-2022 02:40 PM

(11-26-2022 01:47 PM)DavidM Wrote:  I wonder if this issue would go away if we could somehow place the source code in an HTML <pre>...</pre> block. I don't believe MyBB supports that, though.

I've created an HTML-page with the following body:
PHP Code:
<pre>
|          |
|&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|
|          |
</pre>

The first line contains ordinary and the last line contains non-breaking spaces.

Then I started the Python web-server with:
Code:
python3 -m http.server

In Firefox I copied the page and this looks good:
the first line contains ordinary spaces and the other two contain non-breaking spaces.
Also the indentation looks fine, even in the first line.
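
For anyone who wants to repeat the experiment, the test page can be generated rather than typed, which avoids accidentally entering the wrong kind of space. This is a minimal sketch in Python. The file name index.html and the meta charset line are my own assumptions and were not part of the body shown above; the charset declaration is there so the browser reads the literal U+00A0 characters as UTF-8.

```python
import html


def build_test_page() -> str:
    """Build a page whose <pre> block has three 10-space rows between bars."""
    rows = [
        "|" + " " * 10 + "|",        # ordinary SPACE (U+0020)
        "|" + "&nbsp;" * 10 + "|",   # NO-BREAK SPACE written as HTML entities
        "|" + "\u00a0" * 10 + "|",   # literal NO-BREAK SPACE (U+00A0)
    ]
    head = '<!DOCTYPE html>\n<meta charset="utf-8">\n'  # assumption, see above
    return head + "<pre>\n" + "\n".join(rows) + "\n</pre>\n"


if __name__ == "__main__":
    page = build_test_page()
    with open("index.html", "w", encoding="utf-8") as f:
        f.write(page)
    # Show what each row looks like once the entities are decoded.
    for row in page.split("\n"):
        if row.startswith("|"):
            print(repr(html.unescape(row)))
```

With the file written to the current directory, python3 -m http.server serves it at the server's root URL.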


RE: Embedded code blocks and Firefox non-breaking spaces - DavidM - 11-26-2022 03:16 PM

It appears that this issue has been discussed by the Mozilla developers in various forms for the last 16 years.

Call me silly, but I just looked at the release notes for the current release of Firefox, and found this item listed as "Unresolved":
Quote:In order to better support certain typographical conventions, Firefox now preserves some non-breaking spaces when copying text to the clipboard instead of changing them to regular spaces.

The new behavior is known to cause problems when non-breaking spaces are used for indentation.

So there may be future changes to how they're handling things.


RE: Embedded code blocks and Firefox non-breaking spaces - Giuseppe Donnini - 11-27-2022 02:23 AM

(11-26-2022 01:46 AM)DavidM Wrote:  If you copy the text in a code block by selecting it and performing "copy", the spaces in the source seem to be getting converted to non-breaking spaces when you paste it into an application. This wreaks havoc if you are trying to paste the source into something like EMU48.

Hi David,

I ran into the same problem two years ago, but since I didn’t particularly like the idea of having to sit there and wait for someone else to fix it, I decided to solve it—as best I could—on my end of the line, i.e. inside the HP-48.

So, here is the little utility I came up with back then. It simply replaces a specific character with another one throughout a given string. Written in machine code, it is blazingly fast, replacing ca. 20,000 characters per second on a real, non-emulated HP-48GX (I just tried it with a giant string of 150,000 No-Break Space characters: it took only 7.33 seconds to convert them all to normal SPACE characters). So, in any modern emulator, the result should be instantaneous.

Since the resulting string is necessarily of the same length as the given string, the replacement can be done in-place, that is, without first making a copy of the original string—unless the latter is referenced elsewhere in the system. In that case, it is wiser to make a copy before carrying out the replacement. Therefore, if your input string requires more than half of the memory available, you should make sure there are no references to it other than the one on the data stack. This can be achieved by turning off the LAST STACK (a.k.a. UNDO), LAST ARG, and (potentially) LAST CMD features.

Below you will find the source code of the program (by the way, with plenty of No-Break Space characters in it, so you could later feed it to its own compiled version, if you want). I called it REPL¢ (for REPLace a character), but you can rename it as you like, since what I'm presenting here is the non-library version of the code. A compiled version (for any HP-48 model) is attached to the posting.


*******************************************************************************
** REPLchr  (Non-Library Version)
*******************************************************************************
** NAME     : The intended User RPL name is REPL\162, where decimal ASCII code
**            162 (A2h) represents the CENT SIGN.
**
** ABSTRACT : Replaces a specific character with another one throughout a given
**            string.
**
** STACK    : ( $ %chr1 %chr2 --> $' )
**
** ERRORS   : "Too Few Arguments"  : The command required more arguments than
**                                   were available on the stack.
**            "Bad Argument Type"  : One or more stack arguments were of
**                                   incorrect type.
**
** DETAIL   : Exchanges the character whose decimal ASCII code is specified by
**            the real on stack level 2 (%chr1) with the character whose
**            decimal ASCII code is specified by the real on stack level 1
**            (%chr2) throughout the string specified on stack level 3 ($).
**
**            Since the resulting string ($') is necessarily of the same length
**            as the given string, the replacement can be done in-place, that
**            is, without first making a copy of the original string--unless
**            the latter is referenced elsewhere in the system.  In that case,
**            a copy is made before carrying out the replacement.
**
**
** HISTORY  : DATE         PROGRAMMER         VERSION / MODIFICATIONS
**            ----------   ----------------   -------------------------------
**            2020.04.21   Giuseppe Donnini   1.0 - First Implementation.
**            2020.12.21                      1.1 - Added System RPL shell;
**                                                  Compacted code.
**            2022.11.26                      1.2 - Revised documentation.
*******************************************************************************
*******************************************************************************
::                ( $ %chr1 %chr2 --> )
  0LASTOWDOB!     ( *Clear any previously saved command name* )
  CK3NOLASTWD     ( *Require 3 arguments; don't look for command name* )
  CK&DISPATCH1    ( *Check argument type; allow 2nd pass for tagged objects* )
  STRREALREAL     ( *Require 1 string and 2 reals* )
  ::
    COERCE2       ( $ #chr1 #chr2 )
    ROT           ( #chr1 #chr2 $ )
    CKREF         ( *Make new copy if string is referenced elsewhere* )
    UNROT         ( $ #chr1 #chr2 )

CODE
        GOSBVL  =POP2#      * Pop value of #chr2 / #chr1 into C[A] / A[A].
        R0=A                * Store ASCII code of chr1 in R0[A].
        R1=C                * Store ASCII code of chr2 in R1[A].
        GOSBVL  =SAVPTR     * Save RPL variables in reserved system RAM.
        C=DAT1  A           * Read address of $.
        D1=C                * Point to $.
        D1=D1+  5           * Point to length field of $.
        A=DAT1  A           * Read length field value.
        D1=D1+  5           * Point to body of $ (start position of searching).
        D0=A                * We use D0 as a counter.
        D0=D0-  5+2         * Compute core length of $ in nibbles (-5);...
*                           * ...decrement counter by 1 char = 2 nibbles (-2).
        A=R0                * Recall ASCII code of chr1.

loop    C=DAT1  B           * Read current character within $.
        ?C#A    B           * Does it match chr1?
        GOYES   skip        * No, then just keep searching.

        A=R1                * Yes, then recall chr2,...
        DAT1=A  B           * ...overwrite current character with chr2,...
        A=R0                * ...and restore value (chr1) of register A[A].

skip    D1=D1+  2           * Point to next character within $.
        D0=D0-  2           * Decrement counter by 1 character, i.e. 2 nibbles.
        GONC    loop        * Reloop until end of $ is reached.

        GOVLNG  =GETPTRLOOP * Restore RPL variables & pass control back to RPL.
ENDCODE
                  ( --> $' )
  ;
;


Now, going back to your specific problem (and mine), all we have to do is:

1. Paste the string into Emu48.
2. Push the reals 160 (decimal ASCII code of NBSP) and 32 (decimal ASCII code of SPACE) onto the stack.
3. Execute REPL¢.

[ Of course, the source code could contain some NBSP characters that are intended as such, e.g. as part of an encoding string. But in that case, there would be no automated solution to the problem, anyway. ]
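
For readers who do not follow Saturn assembly, the replacement loop can be modelled in a few lines of Python. This is only an illustrative sketch of the same idea (a same-length, in-place byte replacement), not a translation of the machine code; the function name repl_chr is made up, and the example assumes the string is held as single-byte characters in which code 160 is the No-Break Space, as in step 2 above.

```python
def repl_chr(data: bytearray, chr1: int, chr2: int) -> bytearray:
    """Replace every byte equal to chr1 with chr2, modifying data in place."""
    for i, byte in enumerate(data):
        if byte == chr1:
            data[i] = chr2  # same length, so no copy of the buffer is needed
    return data


if __name__ == "__main__":
    # Hypothetical sample line whose separators arrived as NBSP (160).
    source = bytearray("::\u00a0DUP\u00a0;", "latin-1")
    repl_chr(source, 160, 32)
    print(source.decode("latin-1"))
```

Because the result has the same length as the input, the buffer is overwritten rather than rebuilt, which mirrors the in-place behaviour described in the post.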


RE: Embedded code blocks and Firefox non-breaking spaces - rprosperi - 11-27-2022 03:13 PM

Thanks for sharing this, Giuseppe! :)

It's an excellent introductory example of using SysRPL w/assembler on a 48. Its narrow scope and great documentation make it easy for a noob to understand what it's doing and how, and likely will make a great starting-place framework for experimenting.


RE: Embedded code blocks and Firefox non-breaking spaces - DavidM - 02-01-2023 12:10 AM

I'm not sure exactly what changed to fix this issue, but copying and pasting from a code block using the latest version of Firefox seems to be using regular spaces (20h) once again instead of non-breaking spaces (A0h).

This definitely makes things easier when browsing with Firefox and trying out code segments with an emulator.