Closed Bug 471579 Opened 16 years ago Closed 15 years ago

reftests timeout very frequently when run on maemo device

Categories

(Testing :: Reftest, defect)

ARM
Maemo
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED
mozilla1.9.2a1

People

(Reporter: jmaher, Assigned: jmaher)

References

Details

(Keywords: fixed1.9.1, Whiteboard: [fixed1.9.1rc3])

Attachments

(2 files, 5 obsolete files)

I get a lot of reftest timeouts when running on the maemo platform to test fennec.  I increased the timeout from 10000 (ms) to 60000 (ms) and only had 4 timeouts.  This is no surprise since the maemo platform runs so slow (mochitests are almost 10 times slower).

I am posting this bug to find an acceptable workaround to this problem.  One method that comes to mind is if we are building for mobile or for the arm architecture, we adjust the value in reftest.js during buildtime.  Another idea is to make this dynamic and either by cli flag or some internal configuration we can adjust the timeout using the original 10000 as the default value.
I modified the timout to be 90000 (ms) and no tests timed out.  I had the slowest test at 87413 in 3 full test runs with others hitting ~72000 and ~66000.  So I would like to make a new limit for the maemo device at 90000.  

When it comes to windows mobile, we might have to adjust this further which all leads down the path of a way to configure this at run time in some fashion.
it would be nice to add this to the reftest code.  I don't know if we can fix this at build time when we generate the reftest.jar file.
We could make LOAD_FAILURE_TIMEOUT not a const, but initialized at startup, and we could use the value of the pref reftest.timeout if set, or the default if not. Then we could add a --timeout=X param to runreftest.py which would set that pref in the testing profile.
I like the idea of a cli variable.  It is easy to modify and keeps things consistent.  It get confusing when you have to set environment variables, edit other files, and use cli parameters.
Is there a reason to not just bump this to 120 seconds, say, across the board?
That means that if you fail, you fail very slowly, thus making unittest runs take longer than they already do.
If you fail due to timeout, sure.  But fails due to "real" timeouts on reftest are pretty rare...
Blocks: 475778
Blocks: 462889
ted, is this something you would like to offer advice on or maybe a patch?
I don't have time to work on it at the moment, but I think my suggestions in comment 3 stand.
Attached patch mobile reftest 1.0 (obsolete) — Splinter Review
I ended up fixing two issues here.

1) reftest timeout.  I have a cli option to runreftest.py (--timeout=x) which will set a user pref "reftest.timeout" if it is an int > 100.  This gets used in reftest.js when timing out a test during the onload

2) reference to XPCOMABI during readManifest is causing reftest to crash on Fennec.  This is resolved with a try/catch clause failing back to a call with no reference to XPCOMABI.  

With both of these changes in this patch, I can run reftests on mobile.
Attachment #380328 - Flags: review?(ted.mielczarek)
Comment on attachment 380328 [details] [diff] [review]
mobile reftest 1.0

>+    try {
>+      sandbox.xulRuntime = {widgetToolkit: xr.widgetToolkit,
>+                            OS: xr.OS,
>+                            XPCOMABI: xr.XPCOMABI};
>+    }
>+    catch(e) {
>+      sandbox.xulRuntime = {widgetToolkit: xr.widgetToolkit,
>+                            OS: xr.OS};
>+    }


Seems like it would be simpler if this were just:

sandbox.xulRuntime = {widgetToolkit: xr.widgetToolkit, OS: xr.OS};
try {
  sandbox.xulRuntime.XPCOMABI = xr.XPCOMABI;
} catch(e) {}


It seems pretty broken that configure isn't defining TARGET_XPCOM_ABI, though.
dbaron, your approach is cleaner, simpler, and achieves the same thing.  Thanks for posting it.

I suspect there are other reasons we have not supported the ARM platform and/or maemo/wince for XPCOMABI.  Maybe Ted knows a bit more about that.
We only set TARGET_XPCOM_ABI for things where we know there's a defined ABI. bsmedberg and I don't know enough about ARM to make a definitive statement about the ABI there. We certainly would take a patch from someone who knows better.

Joel: dbaron will need to review the reftest.js parts of your patches, I can review the runreftest.py parts.
Attached patch mobile reftest 1.1 (obsolete) — Splinter Review
updated patch to include dbaron's recommendation for ignoring XPCOMABI.
Assignee: nobody → jmaher
Attachment #380328 - Attachment is obsolete: true
Status: NEW → ASSIGNED
Attachment #380409 - Flags: review?
Attachment #380328 - Flags: review?(ted.mielczarek)
Attachment #380409 - Flags: review?(ted.mielczarek)
Attachment #380409 - Flags: review?(dbaron)
Attachment #380409 - Flags: review?
Comment on attachment 380409 [details] [diff] [review]
mobile reftest 1.1

>-const LOAD_FAILURE_TIMEOUT = 10000; // ms
>+var gTimeoutValue = 0;

Could you call this "gLoadTimeout" instead of "gTimeoutValue"?

>+    // for processors we don't have full ABI support for (ARM)

This comment should say something more like:
  // getting xr.XPCOMABI throws on configurations we haven't chosen an ABI name for

r=dbaron on the reftest.js part with those changes
See also bug 479518 for increasing the default (e.g. desktop/tinderbox) timeout.
Comment on attachment 380409 [details] [diff] [review]
mobile reftest 1.1

+def createReftestProfile(profileDir, timeout):

I think I'd prefer it if you just passed the options object down to createReftestProfile, and let it check .timeout internally. That way if we want to add more prefs, we don't have to add more parameters to this method.

+  parser.add_option("--timeout",              
+                    action = "store", dest = "timeout",     
+                    default = 10000,
+                    help = "reftest will timeout in specified #milliseconds, default 10000 (10 seconds)")

You can have the default interpolated into the help string with %default, see:
http://docs.python.org/library/optparse.html#generating-help

+  # we have to have a timeout as an int and since this is ms, >100
+  if int(options.timeout) <= 100:
+    options.timeout = -1

I don't get the point of this. If someone sets a ridiculously low timeout, they're only hurting themselves. (Also, 100ms seems just as likely to break tests as <100ms.) I'd just drop this check.

Also, per bug 479518, why don't we bump this to 2 or 3 minutes while you're in here?

r=me with those comments addressed.
Attachment #380409 - Flags: review?(ted.mielczarek) → review+
Attached patch mobile reftest 1.2 (obsolete) — Splinter Review
updated patch with feedback from dbaron (comments), and ted.  This should make bug 479518 obsolete as well since we are touching all the default timeout code in reftest.js
Attachment #380409 - Attachment is obsolete: true
Attachment #382143 - Flags: review?(ted.mielczarek)
Comment on attachment 382143 [details] [diff] [review]
mobile reftest 1.2

+  try:
+    if (int(options.timeout)):
+      prefsFile.write("""user_pref("reftest.timeout", """ + options.timeout + ");")
+  except:
+    pass

You don't need those extra parens in the if statement. Also, int() will throw a ValueError exception if it's not an int, so I don't think this is exactly what you want to do. Finally, the .write would be better written like:
prefsFile.write('user_pref("reftest.timeout, %d);' % options.timeout)

+  parser.add_option("--timeout",              
+                    action = "store", dest = "timeout",     

Should tack a type="int" in here, like the option below it, then you don't have to worry about the int bit above.

r=me with those fixes.
Attachment #382143 - Flags: review?(ted.mielczarek) → review+
Attached patch mobile reftest 1.3 (obsolete) — Splinter Review
updated with requested changes.  tested command line handling (positive/negative) cases as well as the propagation of the new timeout value to the tests by setting it low and verifying timeout errors.  Done on linux in m-c.
Attachment #382143 - Attachment is obsolete: true
Attachment #382560 - Flags: review?(ted.mielczarek)
Keywords: checkin-needed
Comment on attachment 382560 [details] [diff] [review]
mobile reftest 1.3

Nits:

>diff --git a/layout/tools/reftest/reftest.js b/layout/tools/reftest/reftest.js

>+      gLoadTimeout = 5 * 60 * 1000; //5 minues as per bug 479518

"// 5 minutes" !

>diff --git a/layout/tools/reftest/runreftest.py b/layout/tools/reftest/runreftest.py

>+                    help = "reftest will timeout in specified #milliseconds, [default %default ms]")

What does "#milliseconds" mean (as a help)?
Does not seem to be in 'checkin-needed' state...
Keywords: checkin-needed
Attached patch mobile reftest 1.4 (obsolete) — Splinter Review
updated patch with grammatically correct comments and more descriptive help
Attachment #383923 - Flags: review?(sgautherie.bz)
Attachment #382560 - Attachment is obsolete: true
Attachment #382560 - Flags: review?(ted.mielczarek)
Comment on attachment 383923 [details] [diff] [review]
mobile reftest 1.4

"//5 minues" still there...

I'm no reviewer, continue to ask Ted.
Attachment #383923 - Flags: review?(sgautherie.bz)
forgot the other comment area.  this is fixed now, thanks sgautherie
Attachment #383923 - Attachment is obsolete: true
Attachment #383929 - Flags: review?(ctalbert)
Doesn't need further review if you fixed the things I asked for.
Attachment #383929 - Flags: review?(ctalbert) → review+
Comment on attachment 383929 [details] [diff] [review]
mobile reftest 1.5
[Checkin: See comment 28 & 30+31]

Just carrying forward Ted's r+.
Keywords: checkin-needed
Comment on attachment 383929 [details] [diff] [review]
mobile reftest 1.5
[Checkin: See comment 28 & 30+31]


http://hg.mozilla.org/mozilla-central/rev/d4fa46306138
Attachment #383929 - Attachment description: mobile reftest 1.5 → mobile reftest 1.5 [Checkin: Comment 28]
Status: ASSIGNED → RESOLVED
Closed: 15 years ago
Flags: in-testsuite-
Resolution: --- → FIXED
Whiteboard: [needs 1.9.1 landing]
Target Milestone: --- → mozilla1.9.2a1
Comment on attachment 383929 [details] [diff] [review]
mobile reftest 1.5
[Checkin: See comment 28 & 30+31]


http://hg.mozilla.org/releases/mozilla-1.9.1/rev/81b0a1ff403c
Attachment #383929 - Attachment description: mobile reftest 1.5 [Checkin: Comment 28] → mobile reftest 1.5 [Checkin: Comment 28 & 30]
Comment on attachment 383929 [details] [diff] [review]
mobile reftest 1.5
[Checkin: See comment 28 & 30+31]


(In reply to comment #30)
> http://hg.mozilla.org/releases/mozilla-1.9.1/rev/81b0a1ff403c

"(1.9.1: without bug 481911 part)"
Attachment #383929 - Attachment description: mobile reftest 1.5 [Checkin: Comment 28 & 30] → mobile reftest 1.5 [Checkin: See comment 28 & 30+31]
Whiteboard: [needs 1.9.1 landing] → [fixed1.9.1rc2+]
Whiteboard: [fixed1.9.1rc2+] → [fixed1.9.1rc3]
This fixes a few reftest regressions from the previous patch on this bug:
 * it puts the XPCOMABI object back on the xulRuntime object rather than directly in the sandbox
 * it ensures that that XPCOMABI object is always a string (though empty in some cases)
 * it fixes widgetToolkit to be defined correctly; the patch incorrectly changed xr.widgetToolkit to xr.widgettoolkit
Attachment #407828 - Flags: review?(ted.mielczarek)
Attachment #407828 - Flags: review?(ted.mielczarek) → review+
Depends on: 524666
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: